$n, m$                      numbers of decision variables and functional constraints
$x$                         decision variable in a primal problem, $x \in \mathbb{R}^n$
$\lambda$                   decision variable in the dual problem, $\lambda \in \mathbb{R}^m$
$f(x)$                      objective function in a primal problem
$x \in X$                   regional constraints, $X \subseteq \mathbb{R}^n$
$g(x) \le b$, $g(x) = b$    functional constraints, $g : \mathbb{R}^n \to \mathbb{R}^m$
$z$                         slack variable, $z \in \mathbb{R}^m$
$Ax \le b$, $Ax + z = b$    linear constraints
$c^T x$                     linear objective function
$L(x, \lambda)$             Lagrangian, $L(x, \lambda) = f(x) - \lambda^T (g(x) - b)$
$\lambda \in Y$             $Y = \{\lambda : \min_{x \in X} L(x, \lambda) > -\infty\}$
$B, N$                      sets of indices of basic and non-basic components of $x$
$A, p, q$                   pay-off matrix and decision variables in a matrix game
$x_{ij}$                    flow on arc $(i, j)$
$v$, $C(S, \bar{S})$        value of a flow, value of cut $(S, \bar{S})$
$c^-_{ij}, c^+_{ij}$        minimum/maximum allowed flows on arc $(i, j)$
$d_{ij}$                    costs per unit flow on arc $(i, j)$
$s_i, d_j$                  source and demand amounts in transportation problem
$\lambda_i, \mu_j$          node numbers in transportation algorithm
1 Preliminaries



1.1 Linear programming 

Consider the problem P.

    P: maximize    $x_1 + x_2$
       subject to  $x_1 + 2x_2 \le 6$
                   $x_1 - x_2 \le 3$
                   $x_1, x_2 \ge 0$

This is a completely linear problem: the objective function and all constraints are
linear. In matrix/vector notation we can write a typical linear program (LP) as

    P: maximize $c^T x$ s.t. $Ax \le b$, $x \ge 0$.
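
As a small illustration of the matrix form, the sketch below stores the data of problem P as plain Python lists and checks feasibility and the objective value at a trial point. The helper names `is_feasible` and `objective` are assumptions made for this example only.

```python
# Problem P in matrix form: maximize c^T x  s.t.  A x <= b, x >= 0.
A = [[1, 2],
     [1, -1]]
b = [6, 3]
c = [1, 1]

def is_feasible(x):
    """Return True if x satisfies A x <= b and x >= 0."""
    rows_ok = all(sum(a_ij * x_j for a_ij, x_j in zip(row, x)) <= b_i
                  for row, b_i in zip(A, b))
    return rows_ok and all(x_j >= 0 for x_j in x)

def objective(x):
    """Return the objective value c^T x."""
    return sum(c_j * x_j for c_j, x_j in zip(c, x))

print(is_feasible([4, 1]), objective([4, 1]))  # a feasible point and its value
print(is_feasible([6, 0]))                     # violates x1 - x2 <= 3
```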



1.2 Optimization under constraints 

The general type of problem we study in this course takes the form 

    maximize    $f(x)$
    subject to  $g(x) = b$
                $x \in X$

where

    $x \in \mathbb{R}^n$                      (n decision variables)
    $f : \mathbb{R}^n \to \mathbb{R}$         (objective function)
    $X \subseteq \mathbb{R}^n$                (regional constraints)
    $g : \mathbb{R}^n \to \mathbb{R}^m$       (m functional equations)
    $b \in \mathbb{R}^m$

Note that minimizing $f(x)$ is the same as maximizing $-f(x)$. We will discuss various
examples of constrained optimization problems. We will also talk briefly about ways
our methods can be applied to real-world problems.



1.3 Representation of constraints 

We may wish to impose a constraint of the form $g(x) \le b$. This can be turned into
an equality constraint by the addition of a slack variable $z$. We write

    $g(x) + z = b$, $z \ge 0$.

It is frequently mathematically convenient to turn all our constraints into equalities. 
We distinguish between functional constraints of the form $g(x) = b$ and
regional constraints of the form $x \in X$.

Together, these define the feasible set for $x$. Typically, 'obvious' constraints like
$x \ge 0$ are catered for by defining $X$ in an appropriate way and more complicated
constraints, that may change from instance to instance of the problem, are expressed
by functional constraints $g(x) = b$ or $g(x) \le b$.

Sometimes the choice is made for mathematical convenience. Methods of solution 
typically treat regional and functional constraints differently. 

The shaded region shown below is the feasible set defined by the constraints for 
problem P. 



The feasible set for P is a convex set. 
1.4 Convexity 

Definition 1.1. A set $S \subseteq \mathbb{R}^n$ is a convex set if

    $x, y \in S \implies \lambda x + (1 - \lambda) y \in S$

for all $x, y \in S$ and $0 \le \lambda \le 1$.

In other words, the line segment joining $x$ and $y$ lies in $S$.



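[Figure: the feasible set of P, bounded by the lines $x_1 = 0$, $x_2 = 0$, $x_1 - x_2 = 3$ and $x_1 + 2x_2 = 6$.]

[Figure: an example of a convex set and of a set that is not convex.]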



The following theorem is easily proved. 



Theorem 1.1. The feasible set of an LP problem is convex.
For functions defined on convex sets we make the following further definitions.

Definition 1.2. A function $f : S \to \mathbb{R}$ is a convex function if the set above its
graph is convex. Equivalently, if

    $\lambda f(x) + (1 - \lambda) f(y) \ge f(\lambda x + (1 - \lambda) y)$, for all $0 \le \lambda \le 1$.

A function $f$ is a concave function if $-f$ is convex.



[Figure: graph of a convex function $f$ against $x$.]



In a general problem of minimizing a general $f$ over a general $S$ there may be
local minima of $f$ which are not global minima. It is usually difficult to find the
global minimum when there are lots of local minima.

This is why convexity is important: if S is a convex set and f is a convex function 
then any local minimum of f is also a global minimum. 

A linear function (as in LP) is both concave and convex, and so all local optima 
of a linear objective function are also global optima. 
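
The inequality in Definition 1.2 can be probed numerically. The sketch below samples random points and weights and reports whether any sample violates the inequality; the function name `looks_convex`, the interval and the tolerance are assumptions of this example. Passing the check is evidence of convexity on the interval, not a proof.

```python
import random

def looks_convex(f, lo, hi, trials=10_000, seed=0):
    """Sample-based check of lam*f(x) + (1-lam)*f(y) >= f(lam*x + (1-lam)*y).

    Returns False as soon as one sampled triple (x, y, lam) violates the
    inequality by more than a small floating-point tolerance.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, lam = rng.uniform(lo, hi), rng.uniform(lo, hi), rng.random()
        chord = lam * f(x) + (1 - lam) * f(y)
        if chord < f(lam * x + (1 - lam) * y) - 1e-9:
            return False
    return True

print(looks_convex(lambda t: t * t, -5, 5))      # t^2 is convex
print(looks_convex(lambda t: -t * t, -5, 5))     # -t^2 is concave, not convex
print(looks_convex(lambda t: 3 * t + 1, -5, 5))  # linear: convex and concave
```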



1.5 Extreme points and optimality 

Notice that in problem P the optimum of $c^T x$ occurs at a 'corner' of the feasible set,
whatever the linear objective function is. In our case, $c^T = (1, 1)$ and the
maximum is at corner C.

[Figure: the feasible set and the direction of the objective function. When the objective function is parallel to an edge, all solutions on this edge are optimal, including the two endpoints.]



If the objective function is parallel to an edge, then there may be other optima on 
that edge, but there is always an optimum at a corner. This motivates the following 
definition. 

Definition 1.3. We say that $x$ is an extreme point of a convex set $S$ if whenever
$x = \theta y + (1 - \theta) z$, for $y, z \in S$, $0 < \theta < 1$, then $x = y = z$.

In other words, $x$ is not in the interior of any line segment within $S$.

[Figure: examples of extreme points of two convex sets. In the first set the corners are the only extreme points and the other boundary points are not extreme; in the second set all boundary points are extreme.]



Theorem 1.2. If an LP has a finite optimum it has an optimum at an extreme point 
of the feasible set. 

For LP problems the feasible set will always have a finite number of extreme points 
(vertices). The feasible set is 'polyhedral', though it may be bounded or unbounded. 
This suggests the following algorithm for solving LPs. 



Algorithm: 



1. Find all the vertices of the feasible set. 

2. Pick the best one. 



This will work, but there may be very many vertices. In fact, for $Ax \le b$, $x \ge 0$,
there can be as many as $\binom{n+m}{m}$ vertices. So if $m = n$, say, then the number of
vertices can be as large as $\binom{2n}{n}$, which is roughly $4^n/\sqrt{\pi n}$ and so
increases exponentially in $n$. This is not a good algorithm!
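
To get a feeling for how quickly this count grows, the following sketch tabulates the binomial bound $\binom{n+m}{m}$ for a few sizes with $m = n$. The function name `max_basic_solutions` is an assumption of this example; the figure it returns is an upper bound on the number of vertices, not the exact number for any particular problem.

```python
from math import comb

def max_basic_solutions(n, m):
    """Number of ways to choose which m of the n + m variables (original
    variables plus slacks) are allowed to be non-zero."""
    return comb(n + m, m)

for n in (2, 5, 10, 20, 50):
    print(n, max_basic_solutions(n, n))
```

Even for $n = m = 50$ the bound already has about thirty digits, which is why enumerating vertices is hopeless for problems of realistic size.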



1.6 *Efficiency of algorithms* 

There is an important distinction between those algorithms whose running times (in
the worst cases) are exponential functions of 'problem size', e.g., $2^n$, and those
algorithms whose running times are polynomial functions of problem size, e.g., $n^k$.
For example, the problem of finding the smallest number in a list of $n$ numbers is
solvable in time proportional to $n$ by simply scanning the numbers. There is a beautiful
theory about the computational complexity of algorithms and one of its main
messages is that problems solvable in polynomial-time are the 'easy' ones.

We shall be learning the simplex algorithm, due to Dantzig, 1947. In worst-case
instances it does not run in polynomial-time. In 1979, Khachiyan discovered a
polynomial-time algorithm for general LP problems (the ellipsoid method). It is of
mainly theoretical interest, being slow in practice. In 1984, Karmarkar discovered a
new polynomial-time algorithm (an interior point method) that competes in speed
with the simplex algorithm.

In contrast, no polynomial-time algorithm is known for general integer LP, in 
which x is restricted to be integer-valued. ILP includes important problems such 
as bin packing, job-shop scheduling, traveling salesman and many other essentially 
equivalent problems of a combinatorial nature. 
2 Lagrangian Methods

2.1 The Lagrangian sufficiency theorem 

Suppose we are given a general optimization problem, 

    P: minimize $f(x)$ s.t. $g(x) = b$, $x \in X$,

with $x \in \mathbb{R}^n$, $b \in \mathbb{R}^m$ (n variables and m constraints). The Lagrangian is

    $L(x, \lambda) = f(x) - \lambda^T (g(x) - b)$,

with $\lambda \in \mathbb{R}^m$ (one component for each constraint). Each component of $\lambda$ is called a
Lagrange multiplier.

The following theorem is simple to prove, and extremely useful in practice.

Theorem 2.1 (Lagrangian sufficiency theorem). If $x^*$ and $\lambda^*$ exist such that $x^*$ is
feasible for P and

    $L(x^*, \lambda^*) \le L(x, \lambda^*)$ for all $x \in X$,

then $x^*$ is optimal for P.

Proof. Define

    $X_b = \{x : x \in X \text{ and } g(x) = b\}$.

Note that $X_b \subseteq X$ and that for any $x \in X_b$

    $L(x, \lambda) = f(x) - \lambda^T (g(x) - b) = f(x)$.

Now $L(x^*, \lambda^*) \le L(x, \lambda^*)$ for all $x \in X$, by assumption, and in particular for $x \in X_b$.
So

    $f(x^*) = L(x^*, \lambda^*) \le L(x, \lambda^*) = f(x)$, for all $x \in X_b$.

Thus $x^*$ is optimal for P. □

Remarks. 

1. Note the 'If' which starts the statement of the theorem. There is no guarantee
that we can find a $\lambda^*$ satisfying the conditions of the theorem for general
problems P. (However, there is a large class of problems for which such a $\lambda^*$ does exist.)

2. At first sight the theorem offers us a method for testing that a solution $x^*$ is
optimal for P without helping us to find $x^*$ if we don't already know it. Certainly
we will sometimes use the theorem this way. But for some problems, there is a
way we can use the theorem to find the optimal $x^*$.
2.2 Example: use of the Lagrangian sufficiency theorem

Example 2.1.

    minimize  $x_1 - x_2 - 2x_3$
    s.t.      $x_1 + x_2 + x_3 = 5$
              $x_1^2 + x_2^2 = 4$

Solution. Since we have two constraints we take $\lambda \in \mathbb{R}^2$ and write the Lagrangian

    $L(x, \lambda) = f(x) - \lambda^T (g(x) - b)$
    $\quad = x_1 - x_2 - 2x_3 - \lambda_1 (x_1 + x_2 + x_3 - 5) - \lambda_2 (x_1^2 + x_2^2 - 4)$
    $\quad = [x_1 (1 - \lambda_1) - \lambda_2 x_1^2] + [x_2 (-1 - \lambda_1) - \lambda_2 x_2^2] + [-x_3 (2 + \lambda_1)] + 5\lambda_1 + 4\lambda_2$.

We first try to minimize $L(x, \lambda)$ for fixed $\lambda$ in $x \in \mathbb{R}^3$. Notice that we can minimize
each square bracket separately.

First notice that $-x_3 (2 + \lambda_1)$ has minimum $-\infty$ unless $\lambda_1 = -2$. So we only want
to consider $\lambda_1 = -2$.

Observe that the terms in $x_1, x_2$ have a finite minimum only if $\lambda_2 < 0$, in which
case the minimum occurs at a stationary point where,

    $\partial L / \partial x_1 = 1 - \lambda_1 - 2\lambda_2 x_1 = 0 \implies x_1 = 3/(2\lambda_2)$
    $\partial L / \partial x_2 = -1 - \lambda_1 - 2\lambda_2 x_2 = 0 \implies x_2 = 1/(2\lambda_2)$.

Let $Y$ be the set of $(\lambda_1, \lambda_2)$ such that $L(x, \lambda)$ has a finite minimum. So

    $Y = \{\lambda : \lambda_1 = -2, \lambda_2 < 0\}$,

and for $\lambda \in Y$ the minimum of $L(x, \lambda)$ occurs at $x(\lambda) = (3/(2\lambda_2), 1/(2\lambda_2), x_3)^T$.
Now to find a feasible $x(\lambda)$ we need

    $\frac{9}{4\lambda_2^2} + \frac{1}{4\lambda_2^2} = 4 \implies \lambda_2 = -\sqrt{5/8}$.

So $x_1 = -3\sqrt{2/5}$, $x_2 = -\sqrt{2/5}$ and $x_3 = 5 - x_1 - x_2 = 5 + 4\sqrt{2/5}$.
The conditions of the Lagrangian sufficiency theorem are satisfied by

    $x^* = (-3\sqrt{2/5}, -\sqrt{2/5}, 5 + 4\sqrt{2/5})^T$ and $\lambda^* = (-2, -\sqrt{5/8})^T$.

So $x^*$ is optimal.
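
The conclusion of Example 2.1 can be checked numerically. The sketch below, written for this example only (the grid of trial points is an arbitrary assumption), confirms that $x^*$ is feasible and that $L(x^*, \lambda^*) \le L(x, \lambda^*)$ at every trial point, which is what the sufficiency theorem requires.

```python
from math import sqrt, isclose

lam1, lam2 = -2.0, -sqrt(5 / 8)
x_star = (-3 * sqrt(2 / 5), -sqrt(2 / 5), 5 + 4 * sqrt(2 / 5))

def f(x):
    """Objective of Example 2.1."""
    return x[0] - x[1] - 2 * x[2]

def lagrangian(x):
    """L(x, lam*) = f(x) - lam1*(g1(x) - 5) - lam2*(g2(x) - 4)."""
    g1 = x[0] + x[1] + x[2] - 5
    g2 = x[0] ** 2 + x[1] ** 2 - 4
    return f(x) - lam1 * g1 - lam2 * g2

# x* satisfies both functional constraints.
assert isclose(sum(x_star), 5)
assert isclose(x_star[0] ** 2 + x_star[1] ** 2, 4)

# L(x*, lam*) <= L(x, lam*) on a grid of trial points
# (x3 drops out of L because lam1 = -2, so it is fixed at 0 here).
grid = [i / 4 for i in range(-40, 41)]
assert all(lagrangian(x_star) <= lagrangian((u, v, 0.0)) + 1e-9
           for u in grid for v in grid)

print(f(x_star), -2 * (sqrt(10) + 5))  # the two values should agree
```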
2.3 Strategy to solve problems with the Lagrangian sufficiency theorem

Attempt to find $x^*$, $\lambda^*$ satisfying the conditions of the theorem as follows.

1. For each $\lambda$ solve the problem

    minimize $L(x, \lambda)$ subject to $x \in X$.

Note that the only constraints involved in this problem are $x \in X$, so this should
be an easier problem to solve than P.

2. Define the set

    $Y = \{\lambda : \min_{x \in X} L(x, \lambda) > -\infty\}$.

If we obtain $-\infty$ for the minimum in step 1 then that $\lambda$ is no good. We consider
only those $\lambda \in Y$ for which we obtain a finite minimum.

3. For $\lambda \in Y$, the minimum will be obtained at some $x(\lambda)$ (that depends on $\lambda$ in
general). Typically, $x(\lambda)$ will not be feasible for P.

4. Adjust $\lambda \in Y$ so that $x(\lambda)$ is feasible. If $\lambda^* \in Y$ exists such that $x^* = x(\lambda^*)$ is
feasible then $x^*$ is optimal for P by the theorem.



2.4 Example: further use of the Lagrangian sufficiency theorem 
Example 2.2. 

    minimize $\frac{1}{1 + x_1} + \frac{1}{2 + x_2}$ s.t. $x_1 + x_2 = b$, $x_1, x_2 \ge 0$.

Solution. We define $X = \{x : x \ge 0\}$ and the Lagrangian

    $L(x, \lambda) = \frac{1}{1 + x_1} + \frac{1}{2 + x_2} - \lambda (x_1 + x_2 - b)$
    $\quad = \left(\frac{1}{1 + x_1} - \lambda x_1\right) + \left(\frac{1}{2 + x_2} - \lambda x_2\right) + \lambda b$.

Note that

    $\left(\frac{1}{1 + x_1} - \lambda x_1\right)$ and $\left(\frac{1}{2 + x_2} - \lambda x_2\right)$

do not have a finite minimum in $x \ge 0$ unless $\lambda \le 0$. So we take $\lambda \le 0$. Observe that
in the range $x \ge 0$ a function of the form $\left(\frac{1}{a + x} - \lambda x\right)$ will have its minimum either at
$x = 0$, if this function is increasing at 0, or at the stationary point of the function,
occurring where $x > 0$, if the function is decreasing at 0. So the minimum occurs at

    $x = 0$ as $\lambda \le -1/a^2$,
    $x = -a + \sqrt{-1/\lambda}$ as $\lambda \ge -1/a^2$,

so defining $c^+ = \max(0, c)$,

    $x(\lambda) = (-a + \sqrt{-1/\lambda})^+$.

At first sight it appears that we don't know which values of $x_1, x_2$ to substitute
into the constraint until we know $\lambda$, and we don't know $\lambda$ until we substitute $x_1, x_2$
into the constraint. But notice that $x_1(\lambda) + x_2(\lambda)$ satisfies

    $x_1(\lambda) + x_2(\lambda) = (-1 + \sqrt{-1/\lambda})^+ + (-2 + \sqrt{-1/\lambda})^+$

    $\quad = 0$ as $\lambda \le -1$,
    $\quad = -1 + 1/\sqrt{-\lambda}$ as $\lambda \in [-1, -1/4]$,
    $\quad = -3 + 2/\sqrt{-\lambda}$ as $\lambda \in [-1/4, 0]$.

So we can see that $x_1(\lambda) + x_2(\lambda)$ is an increasing and continuous function (although
it is not differentiable at $\lambda = -1$ and at $\lambda = -1/4$).

[Figure: graph of $x_1(\lambda) + x_2(\lambda)$ against $\lambda$.]

Thus (by the Intermediate Value Theorem) for any $b > 0$ there will be a unique
value of $\lambda$, say $\lambda^*$, for which $x_1(\lambda^*) + x_2(\lambda^*) = b$. This $\lambda^*$ and corresponding $x^*$ will
satisfy the conditions of the theorem and so $x^*$ is optimal. ■
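
Because $x_1(\lambda) + x_2(\lambda)$ is continuous and increasing on $(-1, 0)$, the value $\lambda^*$ in Example 2.2 can be located by bisection. The following is a minimal sketch under that assumption; the names `x_of` and `solve`, the bracket endpoints and the tolerance are choices made for this illustration.

```python
from math import sqrt

def x_of(lam, a):
    """Minimizer of 1/(a + x) - lam*x over x >= 0, valid for lam < 0."""
    return max(0.0, -a + sqrt(-1.0 / lam))

def solve(b, tol=1e-12):
    """Find lam* in (-1, 0) with x1(lam*) + x2(lam*) = b, assuming b > 0."""
    lo, hi = -1.0, -1e-12      # the sum is 0 at lo and very large at hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if x_of(mid, 1) + x_of(mid, 2) < b:
            lo = mid           # sum too small: move towards 0
        else:
            hi = mid
    lam = (lo + hi) / 2
    return lam, (x_of(lam, 1), x_of(lam, 2))

lam, (x1, x2) = solve(3.0)
print(lam, x1, x2)
```

For $b = 3$ the third branch of the formula applies, giving $-3 + 2/\sqrt{-\lambda} = 3$, so the computed $\lambda^*$ should be close to $-1/9$.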

Examples of this kind are fairly common. 
3 The Lagrangian Dual

3.1 The Lagrangian dual problem 

We have defined the set

    $Y = \{\lambda : \min_{x \in X} L(x, \lambda) > -\infty\}$.

For $\lambda \in Y$ define

    $L(\lambda) = \min_{x \in X} L(x, \lambda)$.

The following theorem is almost as easy to prove as the sufficiency theorem.

Theorem 3.1 (weak duality theorem). For any feasible $x \in X_b$ and any $\lambda \in Y$

    $L(\lambda) \le f(x)$.

Proof. For $x \in X_b$, $\lambda \in Y$,

    $f(x) = L(x, \lambda) \ge \min_{x \in X_b} L(x, \lambda) \ge \min_{x \in X} L(x, \lambda) = L(\lambda)$.

□

Thus, provided the set $Y$ is non-empty, we can pick any $\lambda \in Y$, and observe that
$L(\lambda)$ is a lower bound for the minimum value of the objective function $f(x)$.

We can now try to make this lower bound as great as possible, i.e., let us consider
the problem

    D: maximize $L(\lambda)$ subject to $\lambda \in Y$,

equivalently,

    D: $\max_{\lambda \in Y} \left\{ \min_{x \in X} L(x, \lambda) \right\}$.

This is known as the Lagrangian dual problem. The original problem is called
the primal problem. The optimal value of the dual is $\le$ the optimal value of the
primal. If they are equal (as happens in LP) we say there is strong duality.

Notice that the idea of a dual problem is quite general. For example, we can look 
again at the two examples we just studied. 

Example 3.1. In Example 2.1 we had $Y = \{\lambda : \lambda_1 = -2, \lambda_2 < 0\}$ and that
$\min_{x \in X} L(x, \lambda)$ occurred for $x(\lambda) = (3/(2\lambda_2), 1/(2\lambda_2), x_3)^T$. Thus

    $L(\lambda) = L(x(\lambda), \lambda) = \frac{10}{4\lambda_2} - 10 + 4\lambda_2$.

The dual problem is thus

    $\max_{\lambda_2 < 0} \left\{ \frac{10}{4\lambda_2} - 10 + 4\lambda_2 \right\}$.

The max is at $\lambda_2 = -\sqrt{5/8}$, and the primal and dual have the same optimal value,
namely $-2(\sqrt{10} + 5)$.

Example 3.2. In Example 2.2, $Y = \{\lambda : \lambda \le 0\}$. By substituting the optimal value
of $x$ into $L(x, \lambda)$ we obtain

    $L(\lambda) = 3/2 + \lambda b$ as $\lambda \le -1$,
    $L(\lambda) = 1/2 + 2\sqrt{-\lambda} + (b + 1)\lambda$ as $\lambda \in [-1, -1/4]$,
    $L(\lambda) = 4\sqrt{-\lambda} + (b + 3)\lambda$ as $\lambda \in [-1/4, 0]$.

We can solve the dual problem, which is maximize $L(\lambda)$ s.t. $\lambda \le 0$. The solution lies
in $-1 \le \lambda \le -1/4$ if $0 < b \le 1$ and in $-1/4 \le \lambda \le 0$ if $1 \le b$. You can confirm that
for all $b$ the primal and dual here have the same optimal values.

3.2 The dual problem for LP 

Construction of the dual problem for LP is straightforward. Consider the primal
problem P:

    maximize     $c^T x$
    subject to   $Ax \le b$, $x \ge 0$
    equivalently $Ax + z = b$, $x, z \ge 0$.

Write the Lagrangian

    $L(x, z, \lambda) = c^T x - \lambda^T (Ax + z - b) = (c^T - \lambda^T A) x - \lambda^T z + \lambda^T b$.

As in the general case, we can find the set $Y$ such that $\lambda \in Y$ implies
$\max_{x, z \ge 0} L(x, z, \lambda)$ is finite, and for $\lambda \in Y$ we compute the minimum of $L(\lambda)$.
(Since P is a maximization problem, $L(\lambda)$ here denotes the maximum of the Lagrangian
and the dual minimizes it.)

Consider the linear term $-\lambda^T z$. If any coordinate $\lambda_i < 0$ we can make $-\lambda_i z_i$ as
large as we like, by taking $z_i$ large. So there is only a finite maximum in $z \ge 0$ if
$\lambda_i \ge 0$ for all $i$.

Similarly, considering the term $(c^T - \lambda^T A) x$, this can be made as large as we like
unless $(c^T - \lambda^T A)_i \le 0$ for all $i$. Thus

    $Y = \{\lambda : \lambda \ge 0, \lambda^T A - c^T \ge 0\}$.

If we pick a $\lambda \in Y$ then $\max_{z \ge 0} -\lambda^T z = 0$ (by choosing $z_i = 0$ if $\lambda_i > 0$ and any $z_i$ if
$\lambda_i = 0$) and also $\max_{x \ge 0} (c^T - \lambda^T A) x = 0$ similarly. Thus for $\lambda \in Y$, $L(\lambda) = \lambda^T b$. So
a pair of primal P and dual D is

    P: maximize $c^T x$ s.t. $Ax \le b$, $x \ge 0$
    D: minimize $\lambda^T b$ s.t. $\lambda^T A \ge c^T$, $\lambda \ge 0$.
Notice that D is itself a linear program. For example,

    P: maximize    $x_1 + x_2$
       subject to  $x_1 + 2x_2 \le 6$
                   $x_1 - x_2 \le 3$
                   $x_1, x_2 \ge 0$

    D: minimize    $6\lambda_1 + 3\lambda_2$
       subject to  $\lambda_1 + \lambda_2 \ge 1$
                   $2\lambda_1 - \lambda_2 \ge 1$
                   $\lambda_1, \lambda_2 \ge 0$

Furthermore, we might write D as

    D: maximize $(-b)^T \lambda$ s.t. $(-A)^T \lambda \le (-c)$, $\lambda \ge 0$.

So D is of the same form as P, but with $c \to -b$, $b \to -c$, and $A \to -A^T$. This
means that the dual of D is P, and so we have proved the following lemma.

Lemma 3.2. In linear programming, the dual of the dual is the primal. 
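
The substitution $c \to -b$, $b \to -c$, $A \to -A^T$ is mechanical, so it can be written as a small function. The sketch below uses a hypothetical helper named `dual` and the data of the example above; applying it twice returns the original data, which is Lemma 3.2 in computational form.

```python
def dual(A, b, c):
    """Map the data (A, b, c) of  max c^T x, A x <= b, x >= 0  to the data of
    its dual written in the same form:  max (-b)^T lam, (-A^T) lam <= -c."""
    neg_A_T = [[-A[i][j] for i in range(len(A))] for j in range(len(A[0]))]
    return neg_A_T, [-c_j for c_j in c], [-b_i for b_i in b]

A, b, c = [[1, 2], [1, -1]], [6, 3], [1, 1]
A_d, b_d, c_d = dual(A, b, c)
print(A_d, b_d, c_d)
print(dual(A_d, b_d, c_d) == (A, b, c))  # the dual of the dual is the primal
```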

3.3 The weak duality theorem in the case of LP 

We can now apply Theorem 3.1 directly to P and D to obtain the following. 

Theorem 3.3 (weak duality theorem for LP). If $x$ is feasible for P (so $Ax \le b$,
$x \ge 0$) and $\lambda$ is feasible for D (so $\lambda \ge 0$, $A^T \lambda \ge c$) then $c^T x \le \lambda^T b$.

Since this is an important result it is worth knowing a proof for this particular
case which does not appeal to the general Theorem 3.1. Naturally enough, the proof
is very similar.

Proof. Write

    $L(x, z, \lambda) = c^T x - \lambda^T (Ax + z - b)$

where $Ax + z = b$, $z \ge 0$. Now for $x$ and $\lambda$ satisfying the conditions of the theorem,

    $c^T x = L(x, z, \lambda) = (c^T - \lambda^T A) x - \lambda^T z + \lambda^T b \le \lambda^T b$.

□ 
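
Weak duality can be observed on the example pair P and D. The sketch below, an illustration only, enumerates points on a coarse grid (the grid spacing is an arbitrary assumption), keeps those feasible for P and for D, and compares every primal value with every dual value.

```python
A, b, c = [[1, 2], [1, -1]], [6, 3], [1, 1]

def primal_feasible(x):
    """A x <= b and x >= 0."""
    return all(xj >= 0 for xj in x) and all(
        sum(a * xj for a, xj in zip(row, x)) <= bi for row, bi in zip(A, b))

def dual_feasible(lam):
    """A^T lam >= c and lam >= 0."""
    cols = list(zip(*A))  # columns of A
    return all(l >= 0 for l in lam) and all(
        sum(a * l for a, l in zip(col, lam)) >= cj for col, cj in zip(cols, c))

grid = [k / 2 for k in range(0, 13)]  # 0, 0.5, ..., 6
xs = [(u, v) for u in grid for v in grid if primal_feasible((u, v))]
lams = [(u, v) for u in grid for v in grid if dual_feasible((u, v))]

primal_values = [c[0] * x[0] + c[1] * x[1] for x in xs]
dual_values = [b[0] * l[0] + b[1] * l[1] for l in lams]
print(max(primal_values), min(dual_values))
print(all(p <= d for p in primal_values for d in dual_values))
```

The best dual value found on the grid need not equal the best primal value, because the optimal dual solution does not have to lie on the grid; the theorem only promises the inequality.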
3.4 Sufficient conditions for optimality

Theorem 3.3 provides a quick proof of the sufficient conditions for optimality of $x^*$, $z^*$
and $\lambda^*$ in a P and D.

Theorem 3.4 (sufficient conditions for optimality in LP). If $x^*$, $z^*$ is feasible for
P and $\lambda^*$ is feasible for D and $(c^T - \lambda^{*T} A) x^* = \lambda^{*T} z^* = 0$ (complementary slackness)
then $x^*$ is optimal for P and $\lambda^*$ is optimal for D. Furthermore $c^T x^* = \lambda^{*T} b$.

Proof. Write $L(x^*, z^*, \lambda^*) = c^T x^* - \lambda^{*T} (Ax^* + z^* - b)$. Now

    $c^T x^* = L(x^*, z^*, \lambda^*)$
    $\quad = (c^T - \lambda^{*T} A) x^* - \lambda^{*T} z^* + \lambda^{*T} b$
    $\quad = \lambda^{*T} b$.

But for all $x$ feasible for P we have $c^T x \le \lambda^{*T} b$ (by the weak duality theorem 3.3)
and this implies that for all feasible $x$, $c^T x \le c^T x^*$. So $x^*$ is optimal for P. Similarly,
$\lambda^*$ is optimal for D (and the problems have the same optimal value). □

The conditions $(c^T - \lambda^{*T} A) x^* = 0$ and $\lambda^{*T} z^* = 0$ are called complementary
slackness conditions.
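
For the example pair P and D, the three conditions of Theorem 3.4 can be verified with exact arithmetic. The sketch below takes $x = (4, 1)$ and the candidate multipliers $\lambda = (2/3, 1/3)$ as given (they are assumed here, not derived) and checks feasibility and complementary slackness componentwise.

```python
from fractions import Fraction as F

A, b, c = [[1, 2], [1, -1]], [6, 3], [1, 1]
x = [F(4), F(1)]            # candidate primal solution
lam = [F(2, 3), F(1, 3)]    # candidate dual solution

# Primal slacks z = b - A x and dual slacks v = A^T lam - c.
z = [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]
v = [sum(a * l for a, l in zip(col, lam)) - cj
     for col, cj in zip(zip(*A), c)]

feasible = all(t >= 0 for t in x + z + lam + v)
slackness = (all(xj * vj == 0 for xj, vj in zip(x, v))
             and all(l * zi == 0 for l, zi in zip(lam, z)))
primal_value = sum(cj * xj for cj, xj in zip(c, x))
dual_value = sum(bi * l for bi, l in zip(b, lam))
print(feasible, slackness, primal_value, dual_value)
```

Since all quantities are non-negative, checking each product $x_j v_j$ and $\lambda_i z_i$ separately is equivalent to checking the two inner products in the theorem.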

3.5 The utility of primal-dual theory 

Why do we care about D instead of just getting on with the solution of P? 

1. It is sometimes easier to solve D than P (and they have the same optimal values). 

2. For some problems it is natural to consider both P and D together (e.g., two 
person zero-sum games, see Lecture 9). 

3. Theorem 3.4 says that for optimality we need three things: primal feasibility, 
dual feasibility and complementary slackness. 

Some algorithms start with solutions that are primal feasible and work towards 
dual feasibility. Others start with dual feasibility. Yet others (in particular 
network problems) alternately look at the primal and dual variables and move 
towards feasibility for both at once. 
4 Solutions to Linear Programming Problems

4.1 Basic solutions

Let us return to the LP problem P in Lecture 1 and look for a more algebraic (less
geometric) characterisation of the extreme points. Let us rewrite P with equality
constraints, using slack variables.

    P: maximize    $x_1 + x_2$
       subject to  $x_1 + 2x_2 + z_1 = 6$
                   $x_1 - x_2 + z_2 = 3$
                   $x_1, x_2, z_1, z_2 \ge 0$

Let us calculate the value of the variables at each of the 6 points marked A-F in
our picture of the feasible set for P. The values are:

         x_1   x_2   z_1   z_2     f
    A     0     0     6     3      0
    B     3     0     3     0      3
    C     4     1     0     0      5
    D     0     3     0     6      3
    E     6     0     0    -3      6
    F     0    -3    12     0     -3



At each point there are two zero and two non-zero variables. This is not surprising.

Geometrically: The 4 lines defining the feasible set can be written $x_1 = 0$; $x_2 = 0$;
$z_1 = 0$; $z_2 = 0$. At the intersection of each pair of lines, two variables are zero.

Algebraically: Constraints $Ax + z = b$ are 2 equations in 4 unknowns. If we choose
2 variables (which can be done in $\binom{4}{2} = 6$ ways) and set them equal to zero we
will be left with two equations in the other two variables. So (provided $A$ and
$b$ are 'nice') there will be a unique solution for the two non-zero variables.

Instead of calling the slack variables $z_1$ and $z_2$, let us call them $x_3$ and $x_4$ so that
we can write P as

    P: maximize    $x_1 + x_2$
       subject to  $Ax = b$
                   $x \ge 0$

Note we have to extend $A$ to

    $\begin{pmatrix} 1 & 2 & 1 & 0 \\ 1 & -1 & 0 & 1 \end{pmatrix}$ and $x$ to $(x_1, x_2, x_3, x_4)^T$.

$A$ is $(m \times n)$ with $n > m$ and there are $m$ equations in $n > m$ unknowns. We can
choose $n - m$ variables in $\binom{n}{m}$ ways. Set them to zero. There is a unique solution to
$Ax = b$ for the remaining $m$ variables (provided $A$ and $b$ are 'nice').

Definition 4.1.

• A basic solution to $Ax = b$ is a solution with at least $n - m$ zero variables.

• A basic solution is non-degenerate if exactly $n - m$ variables are zero.

• The choice of the $m$ non-zero variables is called the basis. Variables in the basis
are called basic; the others are called non-basic.

• If a basic solution satisfies $x \ge 0$ then it is called a basic feasible solution.

So A-F are basic solutions (and non-degenerate) and A-D are basic feasible
solutions. Henceforth, we make an assumption.

Assumption 4.1. The $m \times n$ matrix $A$ has the property that

• The rank of $A$ is $m$.

• Every set of $m$ columns of $A$ is linearly independent.

• If $x$ is a b.f.s. of P, then $x$ has exactly $m$ non-zero entries (non-degeneracy).

Theorem 4.1. Basic feasible solutions = extreme points of the feasible set.

Proof. Suppose $x$ is a b.f.s. Then $x$ has exactly $m$ non-zero entries. Suppose $x =
\theta y + (1 - \theta) z$ for feasible $y, z$ and $0 < \theta < 1$. Then if the $i$th entry of $x$ is non-basic
then $x_i = 0$ and hence $y_i = z_i = 0$, since $y_i, z_i \ge 0$. This means both $y$ and $z$ have
at least $n - m$ zero entries. The equation $Ay = b = Az$ implies $A(y - z) = 0$. Since
at most $m$ entries of $y - z$ are non-zero, and any set of $m$ columns of $A$ is linearly
independent, we have $y = z$, and $x$ is an extreme point as claimed.

Now suppose $x$ is feasible and extreme but not basic. Then $x$ has $r$ ($> m$) non-zero
entries, say $x_{i_1}, \ldots, x_{i_r} > 0$. Let $a_i$ denote the $i$th column of $A$. Since the columns
$a_{i_1}, \ldots, a_{i_r}$ are linearly dependent, there exist numbers $y_{i_1}, \ldots, y_{i_r}$, not all zero, such that
$y_{i_1} a_{i_1} + \cdots + y_{i_r} a_{i_r} = 0$. Set $y_i = 0$ if $i \ne i_1, \ldots, i_r$. Now $Ay = 0$, so $A(x \pm \epsilon y) = b$.
We can choose $\epsilon > 0$ small enough so that both $x + \epsilon y$ and $x - \epsilon y$ are feasible. Hence,
we have succeeded in expressing $x$ as a convex combination of two distinct points of
$X_b$, since $x = \tfrac{1}{2}(x + \epsilon y) + \tfrac{1}{2}(x - \epsilon y)$. That is, $x$ is not extreme. □

Theorem 4.2. If an LP has a finite solution then there is an optimal basic feasible
solution.

Proof. Suppose $x$ is an optimal solution, but is not basic. Then there exists nonzero
$y$ s.t. $x_i = 0 \implies y_i = 0$ and $Ay = 0$. Consider $x(\epsilon) = x + \epsilon y$. Clearly there exists
some $\epsilon$, chosen positive or negative, so that $c^T x(\epsilon) \ge c^T x$, and such that $x(\epsilon) \ge 0$,
and $Ax(\epsilon) = Ax = b$, but $x(\epsilon)$ has fewer nonzero entries than $x$. Repeating this step,
if necessary, yields an optimal solution that is basic. □
Taking Theorems 4.1 and 4.2 together, we have proved Theorem 1.2. So we can
do algebra instead of drawing a picture (which is good for a computer, and good for
us if there are many variables and constraints). A simple (and foolish) algorithm can
be stated:

Algorithm:

1. Find all the basic solutions.

2. Test to see which are feasible.

3. Choose the best basic feasible solution.

Unfortunately it is not usually easy to know which basic solutions will turn out to
be feasible before calculating them. Hence, even though there are often considerably
fewer basic feasible solutions we will still need to calculate all $\binom{n}{m}$ basic solutions.
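
For the small problem P the foolish algorithm is easy to carry out. The sketch below is specific to the case $m = 2$ (it solves each $2 \times 2$ system by Cramer's rule) and the helper name `basic_solution` is an assumption of this example; it is meant to reproduce the table of points A-F, not to be a general method.

```python
from fractions import Fraction as F
from itertools import combinations

A = [[1, 2, 1, 0],
     [1, -1, 0, 1]]   # columns: x1, x2, z1 (= x3), z2 (= x4)
b = [6, 3]
c = [1, 1, 0, 0]

def basic_solution(basis):
    """Set the non-basis variables to zero and solve the remaining 2x2 system
    by Cramer's rule. Returns None if the two basis columns are dependent."""
    i, j = basis
    det = A[0][i] * A[1][j] - A[0][j] * A[1][i]
    if det == 0:
        return None
    x = [F(0)] * 4
    x[i] = F(b[0] * A[1][j] - A[0][j] * b[1], det)
    x[j] = F(A[0][i] * b[1] - b[0] * A[1][i], det)
    return x

# Step 1: all basic solutions. Step 2: keep the feasible ones. Step 3: best.
solutions = [s for s in map(basic_solution, combinations(range(4), 2)) if s]
feasible = [s for s in solutions if min(s) >= 0]
best = max(feasible, key=lambda s: sum(cj * sj for cj, sj in zip(c, s)))
print(len(solutions), len(feasible), [str(t) for t in best])
```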



4.2 Primal-dual relationships 

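Now let us look at problem D which can be written, after introducing slack variables
$v_1$ and $v_2$, as

    D: minimize    $6\lambda_1 + 3\lambda_2$
       subject to  $\lambda_1 + \lambda_2 - v_1 = 1$
                   $2\lambda_1 - \lambda_2 - v_2 = 1$
                   $\lambda_1, \lambda_2, v_1, v_2 \ge 0$

[Figure: the feasible set of D in the $(\lambda_1, \lambda_2)$ plane, with constraint lines including $\lambda_1 = 0$, $\lambda_2 = 0$ and $v_1 = 0$, and the basic solutions labelled.]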

The value of the variables, etc., at the points A-F in P (as above) and D are:

    P:   x_1   x_2   z_1   z_2     f
    A     0     0     6     3      0
    B     3     0     3     0      3
    C     4     1     0     0      5
    D     0     3     0     6      3
    E     6     0     0    -3      6
    F     0    -3    12     0     -3

    D:   v_1   v_2   λ_1   λ_2     f
    A    -1    -1     0     0      0
    B     0    -2     0     1      3
    C     0     0    2/3   1/3     5
    D   -1/2    0    1/2    0      3
    E     0     1     1     0      6
    F    -2     0     0    -1     -3



Observe, that for D, as for P above, there are two zero and two non-zero variables 
at each intersection (basic solution). C and E are feasible for D. The optimum is at 
C with optimum value 5 (assuming we are minimizing and the other basic solutions 
are not feasible.) 
We make the following observations by comparing lists of basic solutions for P
and D.

1. For each basic solution for P there is a corresponding basic solution for D.
[Labels A-F have been chosen so that corresponding solutions have the same
labels.] Each pair

(a) has the same value of the objective function.

(b) satisfies complementary slackness, i.e., $x_i v_i = 0$, $\lambda_i z_i = 0$,

so for each corresponding pair,

    P                                       D
    variables $x$                           constraints
    $x_i$ basic ($x_i \ne 0$)          =>   constraint: tight ($v_i = 0$)
    $x_i$ non-basic ($x_i = 0$)        =>   constraint: slack ($v_i \ne 0$)
    constraints                             variables $\lambda$
    constraint: tight ($z_i = 0$)      =>   $\lambda_i$ basic ($\lambda_i \ne 0$)
    constraint: slack ($z_i \ne 0$)    =>   $\lambda_i$ non-basic ($\lambda_i = 0$)



(These conditions determine which basic solutions in P and D are paired; the 
implications go both ways because in this example all basic solutions are non- 
degenerate.) 

2. There is only one pair that is feasible for both P and D, and that solution is C, 
which is optimal, with value 5, for both. 

3. For any $x$ feasible for P and $\lambda$ feasible for D we have $c^T x \le b^T \lambda$ with equality
if and only if $x$, $\lambda$ are optima for P and D.

This correspondence between P and D is so symmetrical and pretty that it feels as 
though it ought to be obvious why it works. Indeed we have already proved the 
following: 

Lemma 3.2. In linear programming, the dual of the dual is the primal.

Theorem 3.3 (weak duality in LP). If $x$ is feasible for P and $\lambda$ is feasible for D
then $c^T x \le b^T \lambda$. (In particular, if one problem is feasible then the other is bounded.)

Theorem 3.4 (sufficient conditions for optimality in LP). If $x$ is feasible for P
and $\lambda$ is feasible for D, and $x$, $\lambda$ satisfy complementary slackness, then $x$ is optimal
for P and $\lambda$ is optimal for D. Furthermore $c^T x = \lambda^T b$.

The following will be proved in Lecture 7.

Theorem 4.3 (strong duality in LP). If both P and D are feasible (each has at
least one feasible solution), then there exist $x$, $\lambda$ satisfying the conditions of Theorem
3.4 above.
5 The Simplex Method

5.1 Preview of the simplex algorithm

Let us look very closely at problem P and see if we can construct an algorithm that
behaves as follows.

Simplex algorithm

1. Start with a basic feasible solution.

2. Test: is it optimal?

3. If YES, stop.

4. If NO, move to an 'adjacent' and better b.f.s. Return to 2.

We need to pick a b.f.s. to start. Let us take vertex A.

    $x_1 = x_2 = 0$; $z_1 = 6$, $z_2 = 3$.

[Even for very large problems it is easy to pick a b.f.s. provided the original constraints
are $m$ constraints in $n$ variables, $Ax \le b$ with $b \ge 0$. Once we add slack variables,
$Ax + z = b$, we have $n + m$ variables and $m$ constraints. If we pick $x = 0$, $z = b$ this
is a b.f.s. More about picking the initial b.f.s. in other cases later.]
Now we can write problem P as:

         $x_1 + 2x_2 + z_1 = 6$        (1)
         $x_1 - x_2 + z_2 = 3$         (2)
    max  $x_1 + x_2 = f$               (0)

The peculiar arrangement on the page is deliberate. Now it is obvious that A is not
optimal because,

1. At A, $x_1 = x_2 = 0$; $z_1 = 6$, $z_2 = 3$.

2. From the form of the objective function we see that increasing either $x_1$ or $x_2$
will improve the solution.

3. From (1) we see that it is possible to increase $x_1$ to 6 and decrease $z_1$ to 0
without violating this equation or making any variable infeasible.

4. From (2) we see that it is possible to increase $x_1$ to 3 and decrease $z_2$ to 0 before
any variable becomes infeasible.

5. Taking (1) and (2) together, then, we can increase $x_1$ to 3, decrease $z_1$ to 3 and
decrease $z_2$ to 0, preserving equality in the two constraints, preserving feasibility
and increasing the objective function.
That is, we should move to the b.f.s.

    $x_1 = 3$, $x_2 = 0$, $z_1 = 3$, $z_2 = 0$, ($f = 3$),

which is vertex B. Note that one variable ($x_1$) has entered the basis and one ($z_2$) has
left; i.e., we have moved to an 'adjacent' vertex. Why was this easy to see?

1. The objective function $f$ was written in terms of the non-basic variables ($x_1 =
x_2 = 0$), so it was easy to see that increasing one of them would improve the
solution.

2. Each basic variable ($z_1$, $z_2$) appeared just once in one constraint, so we could
consider the effect of increasing $x_1$ on each basic variable separately when deciding
how much we could increase $x_1$.

This suggests we try to write the problem so that the conditions above hold at
B, our new b.f.s. We can do this by adding multiples of the second equation to the
others (which is obviously allowed as we are only interested in variables satisfying
the constraints).

So P can be written,

    (1) - (2)    $3x_2 + z_1 - z_2 = 3$       (1)'
    (2)          $x_1 - x_2 + z_2 = 3$        (2)'
    (0) - (2)    $2x_2 - z_2 = f - 3$         (0)'

This form of P (equivalent to the original) is what we wanted.

1. The objective function is written in terms of the non-basic variables $x_2$, $z_2$.

2. Basic variables $x_1$, $z_1$ each appear just once in one constraint.

The next step is now easy. Remember we are at B:

    $x_1 = 3$, $x_2 = 0$, $z_1 = 3$, $z_2 = 0$, ($f = 3$).

1. From the objective function it is obvious that increasing $x_2$ is good, whereas
increasing $z_2$ would be bad (and since $z_2$ is zero we can only increase it).

2. Equation (1)' shows that we can increase $x_2$ to 1 and decrease $z_1$ to 0.

3. Equation (2)' doesn't impose any restriction on how much we can increase $x_2$
(we just would need to increase $x_1$ also).

4. Thus we can increase $x_2$ to 1, while decreasing $z_1$ to 0 and increasing $x_1$ to 4.
So we move to vertex C:

    $x_1 = 4$, $x_2 = 1$, $z_1 = 0$, $z_2 = 0$, ($f = 5$).

Now, rewriting the problem again into the desired form for vertex C we obtain

    $\tfrac{1}{3}$(1)'           $x_2 + \tfrac{1}{3} z_1 - \tfrac{1}{3} z_2 = 1$
    (2)' + $\tfrac{1}{3}$(1)'    $x_1 + \tfrac{1}{3} z_1 + \tfrac{2}{3} z_2 = 4$
    (0)' - $\tfrac{2}{3}$(1)'    $-\tfrac{2}{3} z_1 - \tfrac{1}{3} z_2 = f - 5$

Now it is clear that we have reached the optimum since

1. We know $x_1 = 4$, $x_2 = 1$, $z_1 = 0$, $z_2 = 0$ is feasible.

2. We know that for any feasible $x$, $z$ we have $f = 5 - \tfrac{2}{3} z_1 - \tfrac{1}{3} z_2 \le 5$. So clearly
our solution (with $z_1 = z_2 = 0$) is the best possible.



5.2 The simplex algorithm 

The procedure just described for problem P can be formalised. Rather than writing
out all the equations each time we write just the coefficients in a table known as the
simplex tableau. To repeat what we have just done, we would write:

                 x_1   x_2   z_1   z_2
    z_1 basic     1     2     1     0   |   6
    z_2 basic     1    -1     0     1   |   3
    a_0j          1     1     0     0   |   0

We label the coefficients in the body of the table ($a_{ij}$), the right hand sides of the
equations ($a_{i0}$), the coefficients in the expression for the objective function ($a_{0j}$)
and the value of the objective function $-a_{00}$, so the tableau contains

    a_ij   |   a_i0
    a_0j   |   a_00

The algorithm is

1. Choose a variable to enter the basis. Pick a $j$ such that $a_{0j} > 0$. (The
variable corresponding to column $j$ will enter the basis.) If all $a_{0j} \le 0$, $j \ge 1$,
then the current solution is optimal.

We picked $j = 1$, so $x_1$ is entering the basis.

2. Find the variable to leave the basis. Choose $i$ to minimize $a_{i0}/a_{ij}$ from
the set $\{i : a_{ij} > 0\}$. If $a_{ij} \le 0$ for all $i$ then the problem is unbounded (see
examples sheet) and the objective function can be increased without limit. If
there is more than one $i$ minimizing $a_{i0}/a_{ij}$ the problem has a degenerate basic
feasible solution (see examples sheet). For small problems you will be OK if you
just choose any one of them and carry on regardless.

We choose $i = 2$ since $3/1 < 6/1$, so the variable corresponding to equation 2
leaves the basis.

3. Pivot on the element $a_{ij}$. (i.e., get the equations into the appropriate form
for the new basic solution.)

(a) multiply row $i$ by $1/a_{ij}$.

(b) add $-(a_{kj}/a_{ij}) \times$ (row $i$) to each row $k \ne i$, including the objective function
row.

We obtain as before 





                 x_1    x_2    z_1    z_2
    z_1 basic     0      3      1     -1   |   3
    x_1 basic     1     -1      0      1   |   3
    a_0j          0      2      0     -1   |  -3


which is the appropriate form for the tableau for vertex B. 

Check that repeating these instructions on the new tableau, by pivoting on $a_{12}$,
produces the appropriate tableau for vertex C.

                 x_1    x_2    z_1    z_2
    x_2 basic     0      1     1/3   -1/3  |   1
    x_1 basic     1      0     1/3    2/3  |   4
    a_0j          0      0    -2/3   -1/3  |  -5


Note that the columns of the tableau corresponding to the basic variables always 
make up the columns of an identity matrix. 

Since the bottom row is now all $\le 0$ the algorithm stops. Don't hesitate to look
back at Subsection 5.1 to see why we take these steps. 
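
The three steps above can be sketched in a few lines of code. The following is a minimal illustration, not a robust solver: the function names `pivot` and `simplex` are our own, the entering column is simply the first one with a positive objective coefficient, and nothing is done about degeneracy or cycling. Exact fractions are used so that the tableaux match the hand calculation.

```python
from fractions import Fraction

def pivot(T, i, j):
    """Pivot the tableau T (a list of rows) on element T[i][j], in place."""
    piv = T[i][j]
    T[i] = [a / piv for a in T[i]]                  # step 3(a)
    for k in range(len(T)):
        if k != i and T[k][j] != 0:
            factor = T[k][j]
            T[k] = [a - factor * b for a, b in zip(T[k], T[i])]  # step 3(b)

def simplex(T, basis):
    """Tableau simplex. Last row of T is the objective row, last column the
    right-hand sides. basis[i] is the column index basic in row i."""
    m = len(T) - 1
    while True:
        objective = T[-1][:-1]
        # Step 1: any column with a positive objective coefficient may enter.
        j = next((c for c, a in enumerate(objective) if a > 0), None)
        if j is None:
            return                                   # optimal
        # Step 2: ratio test over rows with a positive entry in column j.
        rows = [i for i in range(m) if T[i][j] > 0]
        if not rows:
            raise ValueError("problem is unbounded")
        i = min(rows, key=lambda r: T[r][-1] / T[r][j])
        pivot(T, i, j)                               # step 3
        basis[i] = j

# Problem P with columns x1, x2, z1, z2 | right-hand side.
T = [[Fraction(v) for v in row] for row in [
    [1,  2, 1, 0, 6],
    [1, -1, 0, 1, 3],
    [1,  1, 0, 0, 0],
]]
basis = [2, 3]                                       # z1, z2 basic initially
simplex(T, basis)
solution = {basis[i]: T[i][-1] for i in range(len(basis))}
optimum = -T[-1][-1]                                 # bottom-right entry is -f
```

Here `solution` maps column indices to values (column 0 is $x_1$, column 1 is $x_2$), and the run passes through the tableaux for vertices A, B and C in that order.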
6 The Simplex Tableau



6.1 Choice of pivot column 

We might have chosen the first pivot as $a_{12}$ which would have resulted in

                 x_1    x_2    z_1    z_2
    z_1 basic     1      2      1      0   |   6
    z_2 basic     1     -1      0      1   |   3
    a_0j          1      1      0      0   |   0

becoming

                 x_1    x_2    z_1    z_2
    x_2 basic    1/2     1     1/2     0   |   3
    z_2 basic    3/2     0     1/2     1   |   6
    a_0j         1/2     0    -1/2     0   |  -3




This is the tableau for vertex D. A further iteration, with pivot $a_{21}$, takes us to the
optimal solution at vertex C. Therefore both choices of initial pivot column require
two steps to reach the optimum.

Remarks. 

1. In general, there is no way to tell in advance which choice of pivot column will
result in the smallest number of iterations. We may choose any column where
$a_{0j} > 0$. A common rule-of-thumb is to choose the column for which $a_{0j}$ is
greatest, since the objective function increases by the greatest amount per unit
increase in the variable corresponding to that column.

2. At each stage of the simplex algorithm we have two things in mind. 

First — a particular choice of basis and basic solution. 

Second — a rewriting of the problem in a convenient form. 
There is always an identity matrix embedded in the tableau corresponding to 
the basic variables. Hence, when the non-basic variables are set to zero the 
equations are trivial to solve for the values of the basic variables. They are just 
given by the right-hand column. 

Check that provided we start with the equations written in this form in the 
initial tableau, the simplex algorithm rules ensure that we obtain an identity 
matrix at each stage. 

3. The tableau obviously contains some redundant information. For example, pro- 
vided we keep track of which equation corresponds to a basic variable, we could 
omit the columns corresponding to the identity matrix (and zeros in the objec- 
tive row). This is good for computer programs, but it is probably better to keep 
the whole thing for hand calculation. 
6.2 Initialization: the two-phase method

In our example there was an obvious basic feasible solution with which to start the 
simplex algorithm. This is not always the case. For example, suppose we have a 
problem like 

    maximize    $-6x_1 - 3x_2$
    subject to  $x_1 + x_2 \ge 1$
                $2x_1 - x_2 \ge 1$
                $3x_2 \le 2$
                $x_1, x_2 \ge 0$

which we wish to solve using the simplex algorithm. We can add slack variables 
(sometimes called surplus variables when they appear in > constraints) to obtain 

    maximize    $-6x_1 - 3x_2$
    subject to  $x_1 + x_2 - z_1 = 1$
                $2x_1 - x_2 - z_2 = 1$
                $3x_2 + z_3 = 2$
                $x_1, x_2, z_1, z_2, z_3 \ge 0$

but there is no obvious b.f.s. since $z_1 = -1$, $z_2 = -1$, $z_3 = 2$ is not feasible.

The trick is to add extra variables called artificial variables, $y_1, y_2$, so that the
constraints are

    $x_1 + x_2 - z_1 + y_1 = 1$
    $2x_1 - x_2 - z_2 + y_2 = 1$
    $3x_2 + z_3 = 2$

Phase I is to minimize $y_1 + y_2$ and we can start this phase with $y_1 = 1$, $y_2 = 1$
and $z_3 = 2$. (Notice we did not need an artificial variable in the third equation.)
Provided the original problem is feasible we should be able to obtain a minimum
of 0 with $y_1 = y_2 = 0$ (since $y_1$ and $y_2$ are not needed to satisfy the constraints if
the original problem is feasible). The point of this is that at the end of Phase I the
simplex algorithm will have found a b.f.s. for the original problem. Phase II is then
to proceed with the solution of the original problem, starting from this b.f.s.

Note: the original objective function doesn't enter into Phase I, but it is useful to 
carry it along as an extra row in the tableau since the algorithm will then arrange 
for it to be in the appropriate form to start Phase II. 

Note also: the Phase I objective must be written in terms of the non-basic variables
to begin Phase I. This can also be accomplished in the tableau. We start with

                x_1   x_2   z_1   z_2   z_3   y_1   y_2
    y_1          1     1    -1     0     0     1     0   |   1
    y_2          2    -1     0    -1     0     0     1   |   1
    z_3          0     3     0     0     1     0     0   |   2
    Phase II    -6    -3     0     0     0     0     0   |   0
    Phase I      0     0     0     0     0    -1    -1   |   0


Preliminary step. Add rows 1 and 2 to the Phase I objective so that it is written 
in terms of non-basic variables. 





                x_1   x_2   z_1   z_2   z_3   y_1   y_2
    y_1          1     1    -1     0     0     1     0   |   1
    y_2          2    -1     0    -1     0     0     1   |   1
    z_3          0     3     0     0     1     0     0   |   2
    Phase II    -6    -3     0     0     0     0     0   |   0
    Phase I      3     0    -1    -1     0     0     0   |   2


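Begin Phase I.

Pivot on $a_{21}$ to get

                x_1   x_2   z_1   z_2   z_3   y_1   y_2
    y_1          0    3/2   -1    1/2    0     1   -1/2  |  1/2
    x_1          1   -1/2    0   -1/2    0     0    1/2  |  1/2
    z_3          0     3     0     0     1     0     0   |   2
    Phase II     0    -6     0    -3     0     0     3   |   3
    Phase I      0    3/2   -1    1/2    0     0   -3/2  |  1/2

Pivot on $a_{14}$ to get

                x_1   x_2   z_1   z_2   z_3   y_1   y_2
    z_2          0     3    -2     1     0     2    -1   |   1
    x_1          1     1    -1     0     0     1     0   |   1
    z_3          0     3     0     0     1     0     0   |   2
    Phase II     0     3    -6     0     0     6     0   |   6
    Phase I      0     0     0     0     0    -1    -1   |   0
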

End of Phase I. $y_1 = y_2 = 0$ and we no longer need these variables (so we drop the
last two columns and the Phase I objective row). But we do have a b.f.s. to start Phase
II with ($x_1 = 1$, $z_2 = 1$, $z_3 = 2$) and the rest of the tableau is already in appropriate
form. So we rewrite the last tableau without the $y_1, y_2$ columns.

Begin Phase II.

                x_1   x_2   z_1   z_2   z_3
    z_2          0     3    -2     1     0   |   1
    x_1          1     1    -1     0     0   |   1
    z_3          0     3     0     0     1   |   2
                 0     3    -6     0     0   |   6


In one more step we reach the optimum, by pivoting on $a_{12}$.

                x_1   x_2   z_1   z_2   z_3
    x_2          0     1   -2/3   1/3    0   |  1/3
    x_1          1     0   -1/3  -1/3    0   |  2/3
    z_3          0     0     2    -1     1   |   1
                 0     0    -4    -1     0   |   5




Notice that in fact the problem we have considered is the same as the problem D,
except that $x$ replaces $\lambda$ and we have added a constraint $3x_2 \le 2$ (which is not tight
at the optimum). It is interesting to compare the final tableau with the final tableau
for problem P (shown again below).

In general, artificial variables are needed when there are constraints like

    $\le -1$, or $\ge 1$, or $= 1$,

unless constraints happen to be of a special form where it is easy to spot a b.f.s. 
If the Phase I objective does not reach zero then the original problem is infeasible. 
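
As a check on the arithmetic, the pivots of the worked example can be replayed mechanically. This is only a sketch that follows the pivot choices made in the text ($a_{21}$, then $a_{14}$, then $a_{12}$); the helper `pivot` and the variable names are our own, and the code does not choose pivots by itself.

```python
from fractions import Fraction as F

def pivot(T, i, j):
    """Pivot tableau T on T[i][j] (0-based indices), in place."""
    piv = T[i][j]
    T[i] = [a / piv for a in T[i]]
    for k in range(len(T)):
        if k != i and T[k][j] != 0:
            factor = T[k][j]
            T[k] = [a - factor * b for a, b in zip(T[k], T[i])]

# Columns: x1 x2 z1 z2 z3 y1 y2 | rhs.
# Rows: y1, y2, z3, Phase II objective, Phase I objective.
T = [[F(v) for v in row] for row in [
    [ 1,  1, -1,  0, 0,  1,  0, 1],
    [ 2, -1,  0, -1, 0,  0,  1, 1],
    [ 0,  3,  0,  0, 1,  0,  0, 2],
    [-6, -3,  0,  0, 0,  0,  0, 0],
    [ 0,  0,  0,  0, 0, -1, -1, 0],
]]

# Preliminary step: add rows 1 and 2 to the Phase I objective row.
T[4] = [a + b + c for a, b, c in zip(T[4], T[0], T[1])]

pivot(T, 1, 0)            # a_21: x1 enters, y2 leaves
pivot(T, 0, 3)            # a_14: z2 enters, y1 leaves
phase1_value = T[4][-1]   # zero means a b.f.s. of the original problem was found

# Drop the y columns and the Phase I row, then continue with Phase II.
T2 = [row[:5] + row[-1:] for row in T[:4]]
pivot(T2, 0, 1)           # a_12: x2 enters, z2 leaves

x2, x1 = T2[0][-1], T2[1][-1]
objective = -T2[3][-1]    # bottom-right entry holds minus the objective value
```

The final values are $x_1 = 2/3$, $x_2 = 1/3$ with objective $-5$, in agreement with the last tableau.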
7 Algebra of Linear Programming



7.1 Sensitivity: shadow prices 



Each row of each tableau merely consists of sums of multiples of rows of the original 
tableau. The objective row = original objective row + scalar multiples of other rows. 
Consider the initial and final tableau for problem P. 





    initial                               final

     1    2    1    0  |  6                0    1    1/3  -1/3  |   1
     1   -1    0    1  |  3                1    0    1/3   2/3  |   4
     1    1    0    0  |  0                0    0   -2/3  -1/3  |  -5


In particular, look at the columns 3 and 4, corresponding to variables $z_1$ and $z_2$. We
can see that

    Final row (1) = $\tfrac13$ initial row (1) $- \tfrac13$ initial row (2)
    Final row (2) = $\tfrac13$ initial row (1) $+ \tfrac23$ initial row (2)
    Final objective row = initial objective row $- \tfrac23$ initial row (1) $- \tfrac13$ initial row (2).

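In particular, suppose we want to make a small change in $b$, so we replace
$\begin{pmatrix} 6 \\ 3 \end{pmatrix}$ by $\begin{pmatrix} 6 + \epsilon_1 \\ 3 + \epsilon_2 \end{pmatrix}$.
Providing $\epsilon_1, \epsilon_2$ are small enough they will not affect the sequence
of simplex operations. Thus if the constraints move just a little the optimum will
still occur with the same variables in the basis. The argument above indicates that
the final tableau will be

     0    1    1/3  -1/3  |   $1 + \tfrac13\epsilon_1 - \tfrac13\epsilon_2$
     1    0    1/3   2/3  |   $4 + \tfrac13\epsilon_1 + \tfrac23\epsilon_2$
     0    0   -2/3  -1/3  |   $-5 - \tfrac23\epsilon_1 - \tfrac13\epsilon_2$

with corresponding solution $x_1 = 4 + \tfrac13\epsilon_1 + \tfrac23\epsilon_2$ and $x_2 = 1 + \tfrac13\epsilon_1 - \tfrac13\epsilon_2$ and objective
function value $5 + \tfrac23\epsilon_1 + \tfrac13\epsilon_2$. If $\epsilon_1, \epsilon_2$ are such that we have $x_1 < 0$ or $x_2 < 0$ then
vertex C is no longer optimal.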

The objective function row of the final tableau shows the sensitivity of the optimal
solution to changes in $b$ and how the optimum value varies with small changes in
$b$. For this reason the values in the objective row are sometimes known as shadow
prices. The idea, in the above example, is that we would be willing to pay a price
of $\tfrac23\epsilon_1$ for relaxation of the right hand side of the first constraint from 6 to $6 + \epsilon_1$.

Notice also that being able to see how the final tableau is related to the initial 
one without looking at the intermediate steps provides a useful way of checking your 
arithmetic if you suspect you have got something wrong! 
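
The sensitivity calculation can be tried numerically. The sketch below assumes, as argued above, that the optimal basis $\{x_1, x_2\}$ is unchanged by the perturbation; the names `B_inv`, `shadow` and `perturbed` are our own. The entries of `B_inv` are read off the $z_1, z_2$ columns of the final tableau for problem P.

```python
from fractions import Fraction as F

# Slack-variable columns (z1, z2) of the final tableau for problem P.
B_inv = [[F(1, 3), F(-1, 3)],    # row in which x2 is basic
         [F(1, 3), F(2, 3)]]     # row in which x1 is basic
# Shadow prices: minus the objective-row entries under z1 and z2.
shadow = [F(2, 3), F(1, 3)]

def perturbed(eps1, eps2):
    """Solution and optimal value when b = (6 + eps1, 3 + eps2).

    Only valid while the returned x1 and x2 stay non-negative, i.e. while
    vertex C remains the optimal vertex."""
    b = [6 + eps1, 3 + eps2]
    x2 = sum(r * v for r, v in zip(B_inv[0], b))
    x1 = sum(r * v for r, v in zip(B_inv[1], b))
    value = sum(s * v for s, v in zip(shadow, b))
    return x1, x2, value

base = perturbed(0, 0)                 # the unperturbed optimum
x1, x2, value = perturbed(F(3, 10), 0)
gain = value - base[2]                 # equals (2/3) * eps1
```

Relaxing the first constraint by $\epsilon_1 = 3/10$ raises the optimal value by $\tfrac23 \cdot \tfrac{3}{10} = \tfrac15$, the amount the shadow price predicts.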
Notice that for problem P the objective rows at the vertices A, B, C and D are:

    A     1     1     0     0   |   0
    B     0     2     0    -1   |  -3
    C     0     0   -2/3  -1/3  |  -5
    D    1/2    0   -1/2    0   |  -3


Compare these values with the basic solutions of the dual problem (on page 15). You 
will see that the objective row of the simplex tableau corresponding to each b.f.s. 
of problem P contains the values of the variables for a complementary slack basic 
solution to problem D (after a sign change). 

The simplex algorithm can (and should) be viewed as searching amongst basic 
feasible solutions of P, for a complementary-slack basic solution of D which is also 
feasible. 

At the optimum of P the shadow prices (which we can read off in the bottom row) 
are also the dual variables for the optimal solution of D. 

7.2 Algebra of the simplex method 

It is convenient to divide the variables into two sets, and to split the matrix $A$
accordingly. For example, given

    $\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$

we can partition the variables into two disjoint subsets (basic and non-basic) $B =
\{1, 2\}$ and $N = \{3\}$ and rewrite the equation

    $\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} a_{13} \\ a_{23} \end{pmatrix} x_3 = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$

or

    $A_B x_B + A_N x_N = b$,

where $x_B = (x_1, x_2)^T$ contains the variables in $B$ and $x_N = (x_3)$ contains the
variables in $N$, and $A_B$ has the columns of $A$ corresponding to variables in $B$ (columns
1 and 2) and $A_N$ has columns corresponding to variables from $N$ (column 3).

You should convince yourself that the two forms of the linear equations are
equivalent and that the same trick would work for general $m \times n$ matrices $A$ and partition
of variables into two sets. If $A = (a_1, \dots, a_n)$, where $a_i$ is the $i$th column, then

    $Ax = \sum_{i \in B} a_i x_i + \sum_{i \in N} a_i x_i = A_B x_B + A_N x_N = b$.
We usually choose $B$ to have $m$ elements (a basis) and $N$ to have $n - m$ elements.
Then setting $x_N = 0$, we solve $A_B x_B = b$ where $A_B$ is an $m \times m$ matrix to find the
basic solution $x_B = A_B^{-1} b$, $x_N = 0$.
Let us take problem P in the form

    P: max $c^T x$ s.t. $Ax = b$, $x \ge 0$.

Given a choice of basis $B$ we can rewrite the problem as above

    max $\{c_B^T x_B + c_N^T x_N\}$ s.t. $A_B x_B + A_N x_N = b$, $x_B, x_N \ge 0$.

At this stage we are just rewriting the equations in terms of the two sets of variables
$x_B$ and $x_N$. The equations hold for any feasible $x$. Now $A_B$ is invertible by our
non-degeneracy assumptions in Assumption 4.1. Thus we can write

    $x_B = A_B^{-1} b - A_B^{-1} A_N x_N$,    (1)

and

    $f = c_B^T x_B + c_N^T x_N$
      $= c_B^T (A_B^{-1} b - A_B^{-1} A_N x_N) + c_N^T x_N$
      $= c_B^T A_B^{-1} b + (c_N^T - c_B^T A_B^{-1} A_N) x_N$    (2)

Equation (1) gives the constraints in a form where $x_B$ appears with an identity matrix
and (2) gives the objective function in terms of the non-basic variables. Thus the
tableau corresponding to basis $B$ will contain (after appropriate rearrangement of
rows and columns)

    basic    non-basic
      I      $A_B^{-1} A_N$                    |  $A_B^{-1} b$
      0      $c_N^T - c_B^T A_B^{-1} A_N$      |  $-c_B^T A_B^{-1} b$


Thus, given a basis (choice of variables) we can work out the appropriate tableau
by inverting $A_B$. Note that for many choices of $B$ we will find that $A_B^{-1} b$ has negative
components and so the basic solution $x_B = A_B^{-1} b$ is not feasible; we need the simplex
algorithm to tell which $B$s to look at.

In programming a computer to implement the simplex algorithm you need only
remember what your current basis is, since the whole tableau can then be computed
from $A_B^{-1}$. The revised simplex algorithm works this way and employs various
tricks to compute the inverse of the new $A_B$ from the old $A_B^{-1}$ (using the fact that
only one variable enters and one leaves). This can be very much more efficient. 
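
To illustrate how the tableau follows from the basis alone, here is a small sketch that builds the four blocks of the tableau from $A$, $b$, $c$ and a list of basic columns. The function names are our own, and the inverse is computed by plain Gauss-Jordan elimination; this is not the update scheme of the revised simplex algorithm, which is not described here.

```python
from fractions import Fraction as F

def inverse(M):
    """Gauss-Jordan inverse of a small non-singular square matrix."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(int(r == c)) for c in range(n)]
           for r, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # find a pivot row
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [a / piv for a in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def tableau_for_basis(A, b, c, B):
    """Return (N, A_B^{-1} A_N, A_B^{-1} b, c_N - c_B A_B^{-1} A_N, -c_B A_B^{-1} b)."""
    n, m = len(A[0]), len(B)
    N = [j for j in range(n) if j not in B]
    AB_inv = inverse([[row[j] for j in B] for row in A])
    body = [[sum(AB_inv[i][k] * A[k][j] for k in range(m)) for j in N]
            for i in range(m)]
    rhs = [sum(AB_inv[i][k] * b[k] for k in range(m)) for i in range(m)]
    reduced = [c[j] - sum(c[B[i]] * body[i][col] for i in range(m))
               for col, j in enumerate(N)]
    corner = -sum(c[B[i]] * rhs[i] for i in range(m))
    return N, body, rhs, reduced, corner

# Problem P in equality form; columns are x1, x2, z1, z2.
A = [[1, 2, 1, 0],
     [1, -1, 0, 1]]
b = [6, 3]
c = [1, 1, 0, 0]
# Basis ordered (x2, x1) to match the row order of the tableau for vertex C.
N, body, rhs, reduced, corner = tableau_for_basis(A, b, c, [1, 0])
```

For this basis the output reproduces the final tableau for problem P: the non-basic columns are those of $z_1, z_2$, the right-hand sides are $(1, 4)$ and the objective row ends in $-5$.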

Now we know that the simplex algorithm terminates at an optimal feasible basis
when the coefficients of the objective row are all $\le 0$. In other words, there is a basis
$B$ for which

    $c_N^T - c_B^T A_B^{-1} A_N \le 0$.

Recall the dual problem is

    D: min $\lambda^T b$ s.t. $A^T \lambda \ge c$.

Let us write $\lambda = (A_B^{-1})^T c_B$. Then we have $A_B^T \lambda = c_B$ and $A_N^T \lambda \ge c_N$, and hence $\lambda$ is
feasible. Furthermore, $x_B = A_B^{-1} b$, $x_N = 0$ is a basic solution for P and
complementary slackness is satisfied since

    $c_B - A_B^T \lambda = 0 \implies (c_B - A_B^T \lambda)^T x_B = 0$,
    $x_N = 0 \implies (c_N - A_N^T \lambda)^T x_N = 0$.

Consequently, $x_B = A_B^{-1} b$, $x_N = 0$ and $\lambda = (A_B^{-1})^T c_B$ are respectively optimal for
the primal and dual. We also have that with these solutions

    $f = c_B^T x_B = c_B^T A_B^{-1} b = \lambda^T b$.

So we have a proof of Theorem 4.3, that the primal and dual have the same objective
value (if we accept that the simplex algorithm terminates at an optimum with a $\le 0$
objective row) for the case of LP problems.

Remark 

We have shown that in general the objective row of the final (optimal) tableau will
contain $c_N^T - \lambda^T A_N$ in the non-basic columns, where $\lambda$ are dual variables. This
is consistent with our observation that in the final tableau the vector $-\lambda$ sits in the
bottom row, in the columns corresponding to the slack variables. We start with a
primal $Ax \le b$ and add slack variables $z$. In this case the objective function is
$c^T x + 0^T z$, so $c_i = 0$ in columns corresponding to slack variables, and the columns
which correspond to the slack variables are the columns of an identity matrix (since
$Ax + Iz = b$). So the part of the objective row beneath the original slack variables
will contain $0^T - \lambda^T I = -\lambda^T$, and $\lambda$ are the dual variables corresponding to the
primal constraints. The rest of the objective row, beneath the original $x$, contains
$c^T - \lambda^T A$, i.e., (up to sign) the values of the slack variables in the dual problem. This is what
we observed in our earlier example. 
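
The relations $A_B^T \lambda = c_B$, $A_N^T \lambda \ge c_N$ and $c_B^T x_B = \lambda^T b$ can be checked directly for problem P. The sketch below is restricted to two constraints so that Cramer's rule suffices; `solve2` and the other names are our own.

```python
from fractions import Fraction as F

A = [[1, 2, 1, 0],
     [1, -1, 0, 1]]        # columns x1, x2, z1, z2
b = [6, 3]
c = [1, 1, 0, 0]
B = [0, 1]                 # the optimal basis {x1, x2}

def solve2(M, r):
    """Solve the 2x2 system M y = r by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [F(r[0] * M[1][1] - M[0][1] * r[1], det),
            F(M[0][0] * r[1] - r[0] * M[1][0], det)]

AB = [[row[j] for j in B] for row in A]
ABt = [list(col) for col in zip(*AB)]

lam = solve2(ABt, [c[j] for j in B])   # A_B^T lambda = c_B
xB = solve2(AB, b)                     # A_B x_B = b

# Objective row of the tableau for this basis: c_j - lambda^T a_j for each column j.
objective_row = [c[j] - sum(lam[i] * A[i][j] for i in range(2)) for j in range(4)]
dual_feasible = all(v <= 0 for v in objective_row)

primal_value = sum(c[j] * x for j, x in zip(B, xB))
dual_value = sum(l * bi for l, bi in zip(lam, b))
```

The computed `objective_row` is $(0, 0, -\tfrac23, -\tfrac13)$, so $-\lambda$ does indeed appear beneath the slack variables, and both objective values equal 5.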
8 Shadow Prices and Lagrangian Necessity



8.1 Shadow prices 

Lagrangian multipliers, dual variables and shadow prices are the same things. Let 
us say a bit more about the latter. 

Suppose you require amounts $b_1, \dots, b_m$ of $m$ different vitamins. There are $n$
foodstuffs available. Let

    $a_{ij}$ = amount of vitamin $i$ in one unit of foodstuff $j$,

and suppose foodstuff $j$ costs $c_j$ per unit. Your problem is therefore to choose the
amount $x_j$ of foodstuff $j$ you buy to solve the LP

    min $\sum_j c_j x_j$

    subject to $\sum_j a_{ij} x_j \ge b_i$, each $i$

    and $x_j \ge 0$, each $j$.

Now suppose that a vitamin company decides to market m different vitamin pills 
(one for each vitamin) and sell them at price pi per unit for vitamin i. Assuming 
you are prepared to switch entirely to a diet of vitamin pills, but that you are not 
prepared to pay more for an artificial carrot (vitamin equivalent) than a real one, 
the company has to maximize profit by choosing prices pi to 

    max $\sum_i b_i p_i$

    subject to $\sum_i a_{ij} p_i \le c_j$, each $j$

    and $p_i \ge 0$, each $i$.

Note that this is the LP which is dual to your problem. The dual variable pi is the 
price you are prepared to pay for a unit of vitamin i and is called a shadow price. 
By extension, dual variables are sometimes called shadow prices in problems where 
their interpretation as prices is very hard (or impossible) to see. 

The dual variables tell us how the optimum value of our problem changes with
changes in the right-hand side ($b$) of our functional constraints. This makes sense in
the example given above. If you require an amount $b_i + \epsilon$ of vitamin $i$ instead of an
amount $b_i$ you would expect the total cost of your foodstuff to change by an amount
$\epsilon p_i$, where $p_i$ is the value to you of a unit of vitamin $i$, even though in your problem
you cannot buy vitamin $i$ separately from the others.

The above makes clear the relationship between dual variables/Lagrange multipliers
and shadow prices in the case of linear programming.

More generally, in linear problems we can use what we know about the optimal
solutions to see how this works. Let us assume the primal problem

    P(b): min $c^T x$
          s.t. $Ax - z = b$, $x, z \ge 0$

has the optimal value $\phi(b)$, depending on $b$. Consider two close together values
of $b$, say $b'$ and $b''$, and suppose that optimal solutions have the same basic variables
(so optimums occur with the same variables being non-zero, though the values of the
variables will change slightly). The optimum still occurs at the same vertex of the
feasible region though it moves slightly. Now consider the dual problem. This is

    max $\lambda^T b$
    s.t. $A^T \lambda \le c$, $\lambda \ge 0$.

In the dual problem the feasible set does not depend on $b$, so the optimum of the dual
will occur with the same basic variables and the same values of the dual variables $\lambda$.
But the value of the optimum dual objective function is $\lambda^T b'$ in one case and $\lambda^T b''$ in
the other and we have seen that the primal and dual have the same optimal values. Hence

    $\phi(b') = \lambda^T b'$ and $\phi(b'') = \lambda^T b''$

and the values of the dual variables $\lambda$ give the rate of change of the objective value
with $b$. The change is linear in this case.

The same idea works in nonlinear problems. 

Example 8.1. Recall Example 2.1, where we had constraints

    $x_1 + x_2 + x_3 = 5$
    $x_1^2 + x_2^2 = 4$

and obtained values of Lagrange multipliers of $\lambda_1 = -2$, $\lambda_2 = -\sqrt{5/8}$.
If we replace the constraints by

    $x_1 + x_2 + x_3 = b_1$
    $x_1^2 + x_2^2 = b_2$

and write $\phi(b)$ = optimal value of the problem with $b = (b_1, b_2)^T$ then you can check
that

    $\left.\frac{\partial \phi}{\partial b_1}\right|_{b=(5,4)} = -2$  and  $\left.\frac{\partial \phi}{\partial b_2}\right|_{b=(5,4)} = -\sqrt{5/8}$.

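Let P(b) be the problem: maximize $f(x)$ : $g(x) \le b$, $x \in \mathbb{R}^n$. Let $\phi(b)$ be its
optimal value.

Theorem 8.1. Suppose $f$ and $g$ are continuously differentiable on $X = \mathbb{R}^n$, and that
for each $b$ there exist unique

• $x^*(b)$ optimal for P(b), and

• $\lambda^*(b) \in \mathbb{R}^m$, $\lambda^*(b) \ge 0$ such that $\phi(b) = \sup_{x \in X} \{f(x) + \lambda^*(b)^T (b - g(x))\}$.

If $x^*$ and $\lambda^*$ are continuously differentiable, then

    $\frac{\partial \phi(b)}{\partial b_i} = \lambda_i^*(b)$.    (3)

Proof.

    $\phi(b) = L(x^*, \lambda^*) = f(x^*) + \lambda^*(b)^T (b - g(x^*))$

Since $L(x^*, \lambda^*)$ is stationary with respect to $x_j^*$, we have for each $j$,

    $\frac{\partial L(x^*, \lambda^*)}{\partial x_j^*} = 0$.

For each $k$ we have either $g_k(x^*) = b_k$, or $g_k(x^*) < b_k$. Since $\lambda^*(b)^T (b - g(x^*)) = 0$
we have in the latter case, $\lambda_k^* = 0$, and so $\partial \lambda_k^* / \partial b_i = 0$. So

    $\frac{\partial \phi(b)}{\partial b_i} = \frac{\partial L(x^*, \lambda^*)}{\partial b_i} + \sum_j \frac{\partial L(x^*, \lambda^*)}{\partial x_j^*} \frac{\partial x_j^*}{\partial b_i}$.

On the r.h.s. above, the second term is 0 and the first term is

    $\frac{\partial L(x^*, \lambda^*)}{\partial b_i} = \lambda_i^*(b) + \sum_k \frac{\partial \lambda_k^*(b)}{\partial b_i} \left[ b_k - g_k(x^*(b)) \right]$.

Now the second term on the r.h.s. above is 0, and so we have (3). □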



8.2 Lagrangian necessity 

In the examples we have studied we have been able to find Lagrange multipliers $\lambda$
that work in the Lagrangian Sufficiency Theorem. We have also observed that the
primal and dual problems have the same optimum values. It is outside the scope of 
the course to establish conditions under which we expect these results to hold, but 
we can give a brief summary. 

Let P(b) be the problem: minimize $f(x)$ s.t. $g(x) = b$ and $x \in X$. Let $\phi(b)$ be its
optimal value. Suppose that $\phi(b)$ is convex in $b$, as shown in the figure. The convexity
of $\phi$ implies that for each $b^*$ there is a tangent hyperplane to $\phi(b)$ at $b^*$ with the graph
of $\phi(b)$ lying entirely above it. This uses the supporting hyperplane theorem for
convex sets, which is geometrically obvious, but some work to prove. In the $(g, f)$
plane, the equation of this tangent hyperplane, through the point $(b^*, \phi(b^*))$, can be
written $y(x) = \phi(b^*) + \lambda^T (g(x) - b^*)$ for some $\lambda$. So $f(x) - y(x)$ is minimized to 0
(over all $x \in X$) by taking $x$ such that $(g(x), f(x)) = (b^*, \phi(b^*))$. Equivalently,
$L(x, \lambda) = f(x) - \lambda^T (g(x) - b^*)$ is minimized to $\phi(b^*)$.

Compare, for example, two problems: (i) minimize $x^2$ s.t. $x = b$, and (ii)
minimize $x^2$ s.t. $x^4 = b$. In (i) we have $\phi(b) = b^2$ and $L(x, \lambda) = x^2 - \lambda(x - b)$ is
minimized at $x = b$ when we take $\lambda = 2b$, whereas in (ii) we have $\phi(b) = b^{1/2}$ and
there is no $\lambda$ such that $L(x, \lambda) = x^2 - \lambda(x^4 - b)$ is minimized at the feasible point $x = b^{1/4}$ (since for
$\lambda > 0$ the minimum is at $x = \infty$, and for $\lambda \le 0$ the minimum is at $x = 0$).

The following theorem gives simple conditions under which $\phi(b)$ is convex in $b$.

Theorem 8.2 (Sufficient conditions for Lagrangian methods to work). Let
P(b) be the problem: minimize $f(x)$ s.t. $g(x) \le b$ and $x \in X$. If the functions $f, g$
are convex, $X$ is convex and $x^*$ is an optimal solution to P, then there exist Lagrange
multipliers $\lambda \in \mathbb{R}^m$ such that $L(x^*, \lambda) \le L(x, \lambda)$ for all $x \in X$.

In particular, Lagrange multipliers always exist in linear programming problems,
provided they have optimal solutions (i.e., are feasible and bounded).

Assuming the supporting hyperplane theorem, the proof of Theorem 8.2 relies
on showing that $\phi(b)$ is convex. To see this, suppose that $x_i$ is optimal for P($b_i$),
$i = 1, 2$. Let $x = \theta x_1 + (1 - \theta) x_2$, and $b = \theta b_1 + (1 - \theta) b_2$. Convexity of $X$
implies $x \in X$, and convexity of $g$ implies $g(x) \le \theta g(x_1) + (1 - \theta) g(x_2) \le b$, so $x$ is feasible for P(b). So
$\phi(b) \le f(x) \le \theta f(x_1) + (1 - \theta) f(x_2) = \theta \phi(b_1) + (1 - \theta) \phi(b_2)$, where the second
inequality follows from convexity of $f$. Thus $\phi(b)$ is convex in $b$.
9 Two Person Zero-Sum Games



9.1 Games with a saddle-point 

We consider games that are zero-sum, in the sense that one player wins what the 
other loses. The players make moves simultaneously. Each has a choice of moves 
(not necessarily the same). If player I makes move i and player II makes move j then 
player I wins (and player II loses) $a_{ij}$. Both players know the $m \times n$ pay-off matrix
$A = (a_{ij})$.

                         II plays j
                      1     2     3     4
                 1   -5     3     1    20
    I plays i    2    5     5     4     6
                 3   -4     6     0    -5


Let us ask what is the best that player I can do if player II plays move $j$.

    II's move:            j =   1    2    3    4
    I's best response:    i =   2    3    2    1
    I wins                      5    6    4   20    <- column maximums

Similarly, we ask what is the best that player II can do if I plays move $i$?

    I's move:             i =   1    2    3
    II's best response:   j =   1    3    4
    I wins                     -5    4   -5         <- row minimums

Here the minimal column maximum $= \min_j \max_i a_{ij} = \max_i \min_j a_{ij} =$ maximal
row minimum $= 4$, when player I plays 2 and player II plays 3. In this case we say
that $A$ has a saddle-point (2, 3) and the game is solved.

Remarks. The game is solved by 'I plays 2' and 'II plays 3' in the sense that 

1. Each player maximizes his minimum gain. 

2. If either player announces any strategy (in advance) other than 'I plays 2' and
'II plays 3', he will do worse.

3. If either player announces that he will play the saddle-point move in advance, 
the other player cannot improve on the saddle-point. 
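
The search for a saddle-point is easy to mechanise. The sketch below (function name `saddle_points` is our own) computes the row minimums and column maximums of the pay-off matrix above and lists every entry that is both the minimum of its row and the maximum of its column.

```python
def saddle_points(A):
    """Return (maximal row minimum, minimal column maximum, saddle points).

    Saddle points are reported as 1-based (i, j) pairs, as in the text."""
    row_min = [min(row) for row in A]
    col_max = [max(col) for col in zip(*A)]
    points = [(i + 1, j + 1)
              for i, row in enumerate(A)
              for j, a in enumerate(row)
              if a == row_min[i] == col_max[j]]
    return max(row_min), min(col_max), points

# Pay-off matrix of Section 9.1 (rows: I's moves, columns: II's moves).
A = [[-5, 3, 1, 20],
     [ 5, 5, 4,  6],
     [-4, 6, 0, -5]]

maximin, minimax, points = saddle_points(A)
```

For this matrix the two values coincide at 4 and the only saddle-point is (2, 3). When the two values differ, as they do for two-finger Morra below, the list of saddle-points is empty.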

9.2 Example: Two-finger Morra, a game without a saddle-point 

Morra is a hand game dating from Roman and Greek times. Each player displays
either one or two fingers and simultaneously guesses how many fingers his opponent
will show. If both players guess correctly or both guess incorrectly then the game
is a tie. If only one player guesses correctly, then that player wins from the other
player an amount equal to the total number of fingers shown. A strategy for a player
is $(a, b)$ = 'show $a$, guess $b$'. The pay-off matrix is

             (1,1)  (1,2)  (2,1)  (2,2)
    (1,1)      0      2     -3      0
    (1,2)     -2      0      0      3
    (2,1)      3      0      0     -4
    (2,2)      0     -3      4      0

Column maximums are all positive and row minimums are all negative. So there 
is no saddle point (even though the game is symmetric and fair). If either player 
announces a fixed strategy (in advance), the other player will win. 

We must look for a solution to the game in terms of mixed strategies. 




9.3 Determination of an optimal strategy 

Each player must use a mixed strategy. Player I plays move $i$ with probability $p_i$,
$i = 1, \dots, m$ and player II plays move $j$ with probability $q_j$, $j = 1, \dots, n$. Player I's
expected payoff if player II plays move $j$ is

    $\sum_i p_i a_{ij}$.

So player I attempts to

    maximize $\left\{ \min_j \sum_i p_i a_{ij} \right\}$ s.t. $\sum_i p_i = 1$, $p_i \ge 0$.

Note that this is equivalent to

    P: max $v$ s.t. $\sum_i a_{ij} p_i \ge v$, each $j$, and $\sum_i p_i = 1$, $p_i \ge 0$,

since $v$ on being maximized will increase until it equals the minimum of the $\sum_i a_{ij} p_i$.
By similar arguments, player II's problem is

    D: min $v$ s.t. $\sum_j a_{ij} q_j \le v$, each $i$, and $\sum_j q_j = 1$, $q_j \ge 0$.

j i 

It is possible to show that P and D are duals to one another (by the standard 
technique of finding the dual of P). Consequently, the general theory gives sufficient 
conditions for strategies p and q to be optimal. 

Let $e$ denote a vector of 1s, the number of components determined by context.

Theorem 9.1. Suppose $p \in \mathbb{R}^m$, $q \in \mathbb{R}^n$, and $v \in \mathbb{R}$, such that

(a) $p \ge 0$, $e^T p = 1$, $p^T A \ge v e^T$ (primal feasibility);

(b) $q \ge 0$, $e^T q = 1$, $Aq \le ve$ (dual feasibility);

(c) $v = p^T A q$ (complementary slackness).

Then $p$ is optimal for P and $q$ is optimal for D with common optimum (the value
of the game) $v$.

Proof. The fact that $p$ and $q$ are optimal solutions to linear programs P and D follows
from Theorem 3.4. Alternatively, note that Player I can guarantee to get at least

    $\min_q p^T A q \ge \min_q (v e^T) q = v$,

and Player II can guarantee that Player I gets no more than

    $\max_p p^T A q \le \max_p p^T (ve) = v$.

In fact, (c) is redundant; it is implied by (a) and (b). □
Remarks. 

1. Notice that this gives the right answer for a game with a saddle-point
(i.e., $v = a_{i^*j^*}$, with $p_{i^*} = q_{j^*} = 1$ and other $p_i, q_j = 0$).

2. Two-finger Morra has an optimal solution $p = q = (0, \tfrac35, \tfrac25, 0)$, $v = 0$, as can
be easily checked. E.g. $p^T A = (0, 0, 0, 1/5) \ge 0 \times e^T$. It is obvious that we
expect to have $p = q$ and $v = 0$ since the game is symmetric between the players
($A = -A^T$). $A$ is called an anti-symmetric matrix.

The optimal strategy is not unique. Another optimal solution is $p = q =
(0, \tfrac47, \tfrac37, 0)$. Player I can play any mixed strategy of the form $(0, \theta, 1 - \theta, 0)$
provided $\tfrac47 \le \theta \le \tfrac35$.

3. These conditions allow us to check optimality. For small problems one can often
use them to find the optimal strategies, but for larger problems it will be best to
use some other technique to find the optimum (e.g., simplex algorithm). Note,
however, that the problems P and D are not in a form where we can apply
the simplex algorithm directly; $v$ does not have a positivity constraint. Also
the constraints are $\sum_i a_{ij} p_i - v \ge 0$ with r.h.s. $= 0$. It is possible, however, to
transform the problem into a form amenable to the simplex algorithm.

(a) Add a constant $k$ to each $a_{ij}$ so that $a_{ij} > 0$ for each $i, j$. This doesn't change
anything, except the value which is now guaranteed to be positive ($v > 0$).

(b) Change variables to $x_i = p_i / v$. We now have that P is

    max $v$ s.t. $\sum_i a_{ij} x_i \ge 1$, $\sum_i x_i = 1/v$, $x_i \ge 0$,

which is equivalent to

    min $\sum_i x_i$ s.t. $\sum_i a_{ij} x_i \ge 1$, $x_i \ge 0$

and this is the type of LP that we are used to. 
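
The optimality conditions (a) and (b) of Theorem 9.1 are simple to test for a proposed pair of strategies. The sketch below checks them for two-finger Morra; the function name `is_optimal` is our own and, since (c) is implied by (a) and (b), only feasibility is tested.

```python
from fractions import Fraction as F

# Pay-off matrix for two-finger Morra; strategies ordered (1,1), (1,2), (2,1), (2,2).
A = [[ 0,  2, -3,  0],
     [-2,  0,  0,  3],
     [ 3,  0,  0, -4],
     [ 0, -3,  4,  0]]

def is_optimal(A, p, q, v):
    """Check conditions (a) and (b) of Theorem 9.1 for strategies p, q and value v."""
    m, n = len(A), len(A[0])
    pA = [sum(p[i] * A[i][j] for i in range(m)) for j in range(n)]
    Aq = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]
    primal = min(p) >= 0 and sum(p) == 1 and all(x >= v for x in pA)
    dual = min(q) >= 0 and sum(q) == 1 and all(x <= v for x in Aq)
    return primal and dual

s1 = [0, F(3, 5), F(2, 5), 0]
s2 = [0, F(4, 7), F(3, 7), 0]
s3 = [0, F(1, 2), F(1, 2), 0]     # theta = 1/2 lies outside [4/7, 3/5]

results = [is_optimal(A, s, s, 0) for s in (s1, s2, s3)]
```

The first two strategies pass the test and the third fails, because with $\theta = \tfrac12$ the last component of $p^T A$ is $-\tfrac12 < 0$.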
9.4 Example: Colonel Blotto 

Colonel Blotto has three regiments and his enemy has two regiments. Both comman- 
ders are to divide their regiments between two posts. At each post the commander 
with the greater number of regiments wins one for each conquered regiment, plus one 
for the post. If the commanders allocate equal numbers of regiments to a post there 
is a stand-off. This gives the pay-off matrix 

                            Enemy commander
                         (2,0)   (1,1)   (0,2)
               (3,0)       3       1       0
    Colonel    (2,1)       1       2      -1
    Blotto     (1,2)      -1       2       1
               (0,3)       0       1       3


Clearly it is optimal for Colonel Blotto to divide his regiments $(i, j)$ and $(j, i)$ with
equal probability. So the game reduces to one with the payoff matrix

                         (2,0)   (1,1)   (0,2)
    (3,0) or (0,3)        3/2      1      3/2
    (2,1) or (1,2)         0       2       0


To derive the optimal solution we can 

(a) look at Colonel Blotto's original problem: maximize$_p$ $\{\min_j \sum_i p_i a_{ij}\}$, i.e.,
maximize$_p$ $\min\{\tfrac32 p, \; p + 2(1 - p)\}$,

(b) attempt to derive p, q, v from the conditions of Theorem 9.1, or 

(c) convert the problem as explained above and use the simplex method. 
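
Approach (a) can be sketched for any game in which player I has only two moves. If $p$ is the probability of the first row, the payoff against each column is a straight line in $p$, so the guaranteed payoff is the lower envelope of these lines and its maximum lies at $p = 0$, $p = 1$ or where two lines cross. The function name `solve_2xn` is our own.

```python
from fractions import Fraction as F
from itertools import combinations

def solve_2xn(A):
    """Maximin mixed strategy for the row player of a 2 x n game.

    Returns (p, v): p is the probability of playing row 1 and v is the
    payoff that this guarantees."""
    top, bottom = A

    def guaranteed(p):
        # Payoff against column j is bottom[j] + (top[j] - bottom[j]) * p.
        return min(b + (t - b) * p for t, b in zip(top, bottom))

    candidates = [F(0), F(1)]
    for (t1, b1), (t2, b2) in combinations(zip(top, bottom), 2):
        slope_diff = (t1 - b1) - (t2 - b2)
        if slope_diff != 0:                      # parallel lines never cross
            p = F(b2 - b1) / slope_diff
            if 0 <= p <= 1:
                candidates.append(p)
    best = max(candidates, key=guaranteed)
    return best, guaranteed(best)

# Reduced Colonel Blotto game: rows "(3,0) or (0,3)" and "(2,1) or (1,2)".
reduced = [[F(3, 2), 1, F(3, 2)],
           [0,       2, 0]]
p, v = solve_2xn(reduced)
```

For the reduced Blotto matrix the lines $\tfrac32 p$ and $2 - p$ cross at $p = \tfrac45$, where both equal $\tfrac65$.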

For this game, $p = (\tfrac45, \tfrac15)$, $q = (\tfrac15, \tfrac35, \tfrac15)$ and $v = \tfrac65$ is optimal. In the original
problem, this means that Colonel Blotto should distribute his regiments as (3,0),
(2,1), (1,2), (0,3) with probabilities $\tfrac25, \tfrac1{10}, \tfrac1{10}, \tfrac25$ respectively, and his enemy should
distribute hers as (2,0), (1,1), (0,2) with probabilities $\tfrac15, \tfrac35, \tfrac15$ respectively.
10 Maximal Flow in a Network



10.1 Max-flow/min-cut theory 

Consider a network consisting of $n$ nodes labelled $1, \dots, n$ and directed edges between
them with capacities $c_{ij}$ on the arc from node $i$ to node $j$. Let $x_{ij}$ denote the flow in
the arc $i \to j$, where $0 \le x_{ij} \le c_{ij}$.

[Figure: a network with flow $v$ entering at node 1 and leaving at node $n$.]

Problem: Find maximal flow from node 1 (the source) to node $n$ (the sink)
subject to the conservation of flow at nodes, i.e.,

    maximize $v$ s.t. $0 \le x_{ij} \le c_{ij}$, for all $i, j$

and

    $\sum_{j \in N} x_{ij} - \sum_{j \in N} x_{ji} = \begin{cases} v & \text{if } i = 1 \\ 0 & \text{if } i = 2, \dots, n-1 \\ -v & \text{if } i = n \end{cases}$

(the first sum is the flow out of node $i$ and the second is the flow into node $i$),
where the summations are understood to be over existing arcs only. $v$ is known as
the value of the flow.

This is obviously an LP problem, but with lots of variables and constraints. We
can solve it more quickly (taking advantage of the special network structure) as
follows.

Definition 10.1. A cut $(S, \bar S)$ is a partition of the nodes into two disjoint subsets
$S$ and $\bar S$ with $1 \in S$ and $n \in \bar S$.

Definition 10.2. The capacity of a cut is

    $C(S, \bar S) = \sum_{i \in S, \, j \in \bar S} c_{ij}$.

Thus given a cut $(S, \bar S)$ the capacity of the cut is the maximal flow from nodes
in $S$ to nodes in $\bar S$. It is intuitively clear that any flow from node 1 to node $n$ must
cross the cut $(S, \bar S)$, since in getting from 1 to $n$ at some stage it must cross from $S$
to $\bar S$. This holds for any flow and any cut.
Example 10.1.

[Figure: a network on four nodes.]

Cut $S = \{1, 3\}$, $\bar S = \{2, 4\}$, $C(S, \bar S) = 3$. Check that the maximal flow is 3.
In fact, we have:

Theorem 10.1 (Max flow/min cut Theorem). The maximal flow value through the 
network is equal to the minimal cut capacity. 

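Proof. Summing the feasibility constraint

    $\sum_{j \in N} x_{ij} - \sum_{j \in N} x_{ji} = \begin{cases} v & \text{if } i = 1 \\ 0 & \text{if } i = 2, \dots, n-1 \\ -v & \text{if } i = n \end{cases}$

over $i \in S$, yields

    $v = \sum_{i \in S, \, j \in N} x_{ij} - \sum_{j \in N, \, i \in S} x_{ji}$
      $= \sum_{i \in S, \, j \in \bar S} x_{ij} - \sum_{j \in \bar S, \, i \in S} x_{ji}$
      $\le C(S, \bar S)$

since for all $i, j$ we have $0 \le x_{ij} \le c_{ij}$. Hence the value of any feasible flow is less
than or equal to the capacity of any cut.

So any flow $\le$ any cut capacity (and in particular max flow $\le$ min cut).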

Now let $f$ be a maximal flow, and define $S \subseteq N$ recursively as follows:

(1) $1 \in S$.

(2) If $i \in S$ and $x_{ij} < c_{ij}$, then $j \in S$.

(3) If $i \in S$ and $x_{ji} > 0$, then $j \in S$.

Keep applying (2) and (3) until no more can be added to $S$.

So $S$ is the set of nodes to which we can increase flow. Now if $n \in S$ we can
increase flow along a path in $S$ and $f$ is not maximal. So $n \in \bar S = N \setminus S$ and $(S, \bar S)$
is a cut. From the definition of $S$ we know that for $i \in S$ and $j \in \bar S$, $x_{ij} = c_{ij}$ and
$x_{ji} = 0$, so in the formula above we get

    $v = \sum_{i \in S, \, j \in \bar S} x_{ij} - \sum_{j \in \bar S, \, i \in S} x_{ji} = C(S, \bar S)$.

So max flow = min cut capacity. □

Corollary 10.2. If a flow value $v$ = cut capacity $C$ then $v$ is maximal and $C$ minimal.

The proof suggests an algorithm for finding the maximal flow.

10.2 Ford-Fulkerson algorithm 

1. Start with a feasible flow (e.g., $x_{ij} = 0$).

2. Construct $S$ recursively by rules (1)-(3) in the proof of Theorem 10.1.

3. If $n \in S$ then there is a path from 1 to $n$ along which we can increase flow by

$$\epsilon = \min_{(i,j)} \max\{x_{ji},\; c_{ij} - x_{ij}\} > 0,$$

where the minimum is taken with respect to all arcs $i \to j$ on the path.
Replace the flow by this increased flow. Return to 2.
If $n \notin S$ then the flow is optimal.

The algorithm is crude and simple; we just push flow through where we can, until 
we can't do so anymore. There is no guarantee that it will be very efficient. With 
hand examples it is usually easy to 'see' the maximal flow. You just demonstrate 
that it is maximal by giving a cut with the same capacity as the flow and appeal to 
the min cut = max flow theorem. 
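The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the network is a made-up 4-node example with integer capacities, the names `label_set` and `ford_fulkerson` are hypothetical, and the search order used to build $S$ is an arbitrary choice (the notes do not prescribe one).

```python
def label_set(c, x):
    """Build S by rules (1)-(3); pred[j] records the arc used to reach node j."""
    pred = {1: None}
    stack = [1]
    while stack:
        i = stack.pop()
        for (a, b), cap in c.items():
            if a == i and b not in pred and x[a, b] < cap:
                pred[b] = (a, b, +1)  # rule (2): spare capacity on arc i -> j
                stack.append(b)
            elif b == i and a not in pred and x[a, b] > 0:
                pred[a] = (a, b, -1)  # rule (3): flow on arc j -> i can be reduced
                stack.append(a)
    return pred


def ford_fulkerson(n, c):
    """Return (value, flow, S) where (S, complement) is a cut of capacity `value`."""
    x = {arc: 0 for arc in c}  # step 1: start from the zero flow
    while True:
        pred = label_set(c, x)  # step 2
        if n not in pred:  # n not in S: the flow is optimal
            out_1 = sum(f for (a, b), f in x.items() if a == 1)
            in_1 = sum(f for (a, b), f in x.items() if b == 1)
            return out_1 - in_1, x, set(pred)
        path, node = [], n  # step 3: trace the path back from n to 1
        while pred[node] is not None:
            a, b, sign = pred[node]
            path.append((a, b, sign))
            node = a if sign == +1 else b
        eps = min(c[a, b] - x[a, b] if sign == +1 else x[a, b]
                  for a, b, sign in path)
        for a, b, sign in path:
            x[a, b] += sign * eps


# Hypothetical network (an assumption, not the one in Example 10.1).
capacity = {(1, 2): 3, (1, 3): 2, (2, 3): 1, (2, 4): 2, (3, 4): 2}
value, flow, S = ford_fulkerson(4, capacity)
cut = sum(cap for (i, j), cap in capacity.items() if i in S and j not in S)
print(value, S, cut)
```

On this example the final set $S$ gives a cut whose capacity equals the flow value, which is exactly the certificate of optimality described above.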

The algorithm can be made not to converge if the capacities are not rational 
multiples of one another. However, 

Theorem 10.3. If capacities and initial flows are rational then the algorithm terminates at a maximal flow in a finite number of steps. (Capacities are assumed to be finite.)

Proof. Multiply by a constant so that all capacities and initial flows are integers. 
The algorithm increases flow by at least 1 on each step, and the flow value is bounded above by the (finite) capacity of any cut, so the algorithm must stop after finitely many steps. □

Example: Failure to stop when capacities and initial flows are not rational

The network consists of a square $bcde$ of directed arcs of capacity 1. The corners of the square are connected to a source at $a$ and a sink at $f$ by arcs of capacity 10. The initial flow of $1 + w$ is shown in the first picture, where $w = (\sqrt{5} - 1)/2$, so $1 - w = w^2$. The first iteration is to increase flow by $w$ along $a \to c \to b \to e \to d \to f$. The second increases it by $w$ along $a \to d \to e \to b \to f$. The flow has increased by $2w$ and the resulting flow in the square is the same as at the start, but multiplied by $w$ and rotated through 180°. Hence the algorithm can continue in this manner forever without stopping and never reach the optimal flow of 40.
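A short calculation makes the failure concrete. It assumes, as the description above suggests, that each later pair of augmentations repeats the previous pair scaled by $w$, so the $k$-th pair adds $2w^k$ to the flow. Using $1 - w = w^2$, the total added after $k$ pairs is

```latex
\[
  2w\left(1 + w + \dots + w^{k-1}\right)
  = 2w\,\frac{1 - w^{k}}{1 - w}
  = \frac{2\,(1 - w^{k})}{w}
  < \frac{2}{w} = 1 + \sqrt{5} \approx 3.24 .
\]
```

Under that assumption the increments form a convergent geometric series, so the flow value stays bounded well below 40 however many steps are taken.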

10.3 Minimal cost circulations 

Definition 10.3. A network is a closed network if there is no flow into or out of 
the network. 

Definition 10.4. A flow in a closed network is a circulation if $\sum_j x_{ij} - \sum_j x_{ji} = 0$ for each node $i$.

Most network problems can be formulated as the problem of finding a minimal cost circulation in a closed network where there are capacity constraints $c^-_{ij} \le x_{ij} \le c^+_{ij}$ on arcs and a cost per unit flow of $d_{ij}$ in arcs $(i, j)$. The full problem is

minimize $\sum_{ij} d_{ij} x_{ij}$

subject to $\sum_j x_{ij} - \sum_j x_{ji} = 0$, each $i$, and $c^-_{ij} \le x_{ij} \le c^+_{ij}$.

Definition 10.5. A circulation which satisfies the capacity constraints is called a 
feasible circulation. 

There is a beautiful algorithm called the out-of-kilter algorithm which will solve 
general problems of this kind. It does not even require a feasible solution with which 
to start. In the next lecture we shall just derive conditions for a flow to be optimal. 

We shall also see, although it should be obvious already, that the max flow problem that is studied in this lecture can be formulated as a minimal cost circulation problem.

11 Minimum Cost Circulation Problems



11.1 Sufficient conditions for a minimal cost circulation 

Recall the minimum cost circulation problem: 

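minimize $\sum_{ij} d_{ij} x_{ij}$
subject to $\sum_j x_{ij} - \sum_j x_{ji} = 0$, each $i$, and $c^-_{ij} \le x_{ij} \le c^+_{ij}$.

Consider the Lagrangian for the problem of finding the minimum cost circulation. We shall treat the capacity constraints as the region constraints, so

$$X = \{x_{ij} : c^-_{ij} \le x_{ij} \le c^+_{ij}\}.$$

We introduce Lagrange multipliers $\lambda_i$ (one for each node) and write

$$L(x, \lambda) = \sum_{ij} d_{ij} x_{ij} - \sum_i \lambda_i \Big( \sum_j x_{ij} - \sum_j x_{ji} \Big).$$

Rearranging we obtain

$$L(x, \lambda) = \sum_{ij} (d_{ij} - \lambda_i + \lambda_j)\, x_{ij}.$$

We attempt to minimize $L(x, \lambda)$ in $X$. Provided $c^-_{ij}$, $c^+_{ij}$ are finite we see that there is a finite minimum for all $\lambda$, achieved such that

$$x_{ij} = \begin{cases} c^-_{ij} & \text{if } d_{ij} - \lambda_i + \lambda_j > 0 \\ c^+_{ij} & \text{if } d_{ij} - \lambda_i + \lambda_j < 0 \end{cases} \qquad (1)$$

$$c^-_{ij} \le x_{ij} \le c^+_{ij} \quad \text{if } d_{ij} - \lambda_i + \lambda_j = 0. \qquad (2)$$

Theorem 11.1. If $(x_{ij})$ is a feasible circulation and there exists $\lambda$ such that $(x_{ij}), \lambda$ satisfy conditions (1) and (2) above, then $(x_{ij})$ is a minimal cost circulation.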

Proof. Apply the Lagrangian sufficiency theorem. □ 
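Conditions (1) and (2) are easy to check mechanically once a circulation and a set of node numbers are proposed. The sketch below does so for a hypothetical three-arc cycle; the data, the dictionary layout `(lower, upper, cost)` and the name `is_optimal_circulation` are assumptions for illustration only.

```python
def is_optimal_circulation(arcs, x, lam):
    """Check feasibility plus conditions (1) and (2) of Theorem 11.1.

    arcs maps (i, j) -> (lower, upper, cost); x maps (i, j) -> flow;
    lam maps node -> node number.
    """
    net = {}
    for (i, j) in arcs:
        net[i] = net.get(i, 0) + x[i, j]  # flow out of i
        net[j] = net.get(j, 0) - x[i, j]  # flow into j
    if any(value != 0 for value in net.values()):
        return False  # not a circulation
    for (i, j), (lower, upper, cost) in arcs.items():
        if not lower <= x[i, j] <= upper:
            return False  # capacity constraint violated
        reduced = cost - lam[i] + lam[j]
        if reduced > 0 and x[i, j] != lower:
            return False  # condition (1), first case
        if reduced < 0 and x[i, j] != upper:
            return False  # condition (1), second case
    return True  # reduced == 0 leaves x free within its bounds: condition (2)


# Hypothetical cycle 1 -> 2 -> 3 -> 1; the lower bound on (3, 1) forces flow.
arcs = {(1, 2): (0, 5, 1), (2, 3): (0, 5, 1), (3, 1): (2, 4, 0)}
lam = {1: 0, 2: -1, 3: -2}
minimal = {(1, 2): 2, (2, 3): 2, (3, 1): 2}
larger = {(1, 2): 3, (2, 3): 3, (3, 1): 3}
print(is_optimal_circulation(arcs, minimal, lam),
      is_optimal_circulation(arcs, larger, lam))
```

The check only certifies optimality for the given $\lambda$; a `False` result means these particular node numbers do not work, not that the circulation is necessarily suboptimal.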

Definition 11.1. The Lagrange multipliers $\lambda_i$ are usually known as node numbers or potentials in network problems.

Definition 11.2. $\lambda_i - \lambda_j$ is known as the tension in the arc $(i, j)$.

11.2 Max-flow as a minimum cost circulation problem

The maximal flow problem we studied earlier can be set up as a minimal cost circulation problem. For each arc $(i, j)$ in the network we assign a capacity constraint $0 \le x_{ij} \le c^+_{ij}$ and all cost $d_{ij} = 0$. Add an arc from node $n$ to 1 with no capacity constraint and cost $-1$.

[Figure: the network with a cut $(S, \bar S)$ and the added return arc from node $n$ to node 1, cost $-1$.]

The cost of the circulation is $-v$, so minimizing $-v$ is the same as maximizing $v$. Let us seek node numbers $\lambda_i$ which will satisfy optimality for this problem. Since arc $(n, 1)$ has no capacity constraints, for a finite optimum we will require

$$d_{n1} - \lambda_n + \lambda_1 = 0 \;\Longrightarrow\; \lambda_1 = \lambda_n + 1.$$

Let us set $\lambda_n = 0$, $\lambda_1 = 1$. (Since it is only the differences in the $\lambda$s that matter, we can pick one arbitrarily.) Let $(S, \bar S)$ be a minimal cut. Assign $\lambda_i = 1$ for $i \in S$ and $\lambda_i = 0$ for $i \in \bar S$. Now check that

(a) For $i, j \in S$ or $i, j \in \bar S$: $d_{ij} - \lambda_i + \lambda_j = 0$, so $x_{ij}$ can take any feasible value.

(b) For $i \in S$, $j \in \bar S$ we have

$$d_{ij} - \lambda_i + \lambda_j = 0 - 1 + 0 = -1 \;\Longrightarrow\; x_{ij} = c^+_{ij}.$$

(c) For $i \in \bar S$, $j \in S$ we have

$$d_{ij} - \lambda_i + \lambda_j = 0 - 0 + 1 = 1 \;\Longrightarrow\; x_{ij} = 0.$$

But conditions (a)-(c) are precisely those satisfied by a maximal flow and minimal cut.

If we like, we can say that the Ford-Fulkerson algorithm in looking for a cut is trying to find node numbers and a flow to satisfy optimality conditions.
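The construction can be checked numerically on a small case. In the sketch below the network, its maximal flow and its minimal cut are all hypothetical values chosen for illustration, and the return arc is modelled with bounds $0 \le x_{n1} < \infty$, which is one assumed reading of "no capacity constraint".

```python
INF = float("inf")
n = 4

# Hypothetical network with a maximal flow of value 4 and minimal cut S = {1, 2, 3}.
capacity = {(1, 2): 3, (1, 3): 2, (2, 3): 1, (2, 4): 2, (3, 4): 2}
max_flow = {(1, 2): 2, (1, 3): 2, (2, 3): 0, (2, 4): 2, (3, 4): 2}
S = {1, 2, 3}
v = 4

# Circulation problem: zero cost on original arcs, return arc (n, 1) at cost -1.
arcs = {arc: (0, cap, 0) for arc, cap in capacity.items()}
arcs[n, 1] = (0, INF, -1)
x = dict(max_flow)
x[n, 1] = v

# Node numbers: 1 on S, 0 on the complement.
lam = {i: (1 if i in S else 0) for i in range(1, n + 1)}

reduced = {(i, j): d - lam[i] + lam[j] for (i, j), (lo, hi, d) in arcs.items()}
conditions_hold = all(
    (r >= 0 or x[arc] == arcs[arc][1])      # case (b): arc must be at capacity
    and (r <= 0 or x[arc] == arcs[arc][0])  # case (c): arc must carry no flow
    for arc, r in reduced.items()
)
total_cost = sum(d * x[arc] for arc, (lo, hi, d) in arcs.items())
print(reduced, conditions_hold, total_cost)
```

Only the arcs crossing the cut have non-zero reduced cost, and the circulation costs $-v$, as in the text.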

Remark

In many problems it is natural to take $c^-_{ij} = 0$, $c^+_{ij} = \infty$. In this case we will achieve a finite optimum only if $d_{ij} - \lambda_i + \lambda_j \ge 0$ for each arc.

Theorem 11.2. For a minimal cost circulation problem with capacity constraints $0 \le x_{ij} < \infty$ on each arc $(i, j)$, if we have a feasible circulation $(x_{ij})$ and node numbers $\lambda_i$ such that

$d_{ij} - \lambda_i + \lambda_j \ge 0$, each $(i, j)$, and
$x_{ij} = 0$ if $d_{ij} - \lambda_i + \lambda_j > 0$,

then $(x_{ij})$ is optimal.

Proof. Apply the Lagrangian sufficiency theorem. □

Note: The optimality conditions imply $(d_{ij} - \lambda_i + \lambda_j)\, x_{ij} = 0$ in this case (complementary slackness).



11.3 The transportation problem 

Consider a network representing the problem of a supplier who has $n$ supply depots from which goods must be shipped to $m$ destinations. We assume there are quantities $s_1, \dots, s_n$ of the goods at depots $\{S_1, \dots, S_n\}$ and that the demands at destinations $\{D_1, \dots, D_m\}$ are given by $d_1, \dots, d_m$. We also assume that $\sum_i s_i = \sum_j d_j$ so that total supply = total demand. Any amount of goods may be taken directly from source $i$ to destination $j$ at a cost of $d_{ij}$ ($i = 1, \dots, n$; $j = 1, \dots, m$) per unit. One formulation of the problem is

minimize $\sum_{ij} d_{ij} x_{ij}$
subject to $\sum_j x_{ij} = s_i$ each $i$, $\sum_i x_{ij} = d_j$ each $j$,
with $x_{ij} \ge 0$ each $i, j$.

Here $x_{ij}$ is the flow from $S_i$ to $D_j$. The network looks like:

[Figure: $n$ sources with supplies $s_1, \dots, s_n$ joined to $m$ destinations with demands $d_1, d_2, \dots, d_m$.]

with arcs $(i, j)$, $0 \le x_{ij} < \infty$, and cost $d_{ij}$ per unit flow.
The Lagrangian for the problem can be written

$$L(x, \lambda, \mu) = \sum_{ij} d_{ij} x_{ij} - \sum_i \lambda_i \Big( \sum_j x_{ij} - s_i \Big) + \sum_j \mu_j \Big( \sum_i x_{ij} - d_j \Big)$$

where we label Lagrange multipliers (node numbers) $\lambda_i$ for sources and $\mu_j$ for destinations. (We choose the sign of $\mu_j$ in this apparently unusual way since it is convenient to think of the demands $d_j$ as being negative supplies. In Section 12.2, we describe the simplex-on-a-graph algorithm, for a problem in which we suppose that there is a supply $b_i$ at each node $i$.) Rearranging,

$$L(x, \lambda, \mu) = \sum_{ij} (d_{ij} - \lambda_i + \mu_j)\, x_{ij} + \sum_i \lambda_i s_i - \sum_j \mu_j d_j.$$

This will have a finite minimum in $x_{ij} \ge 0$ if $d_{ij} - \lambda_i + \mu_j \ge 0$ on each arc, and the minimum occurs with $(d_{ij} - \lambda_i + \mu_j)\, x_{ij} = 0$ on each arc. Thus the Lagrangian sufficiency theorem gives the same optimality conditions as before.

Theorem 11.3. A flow $x_{ij}$ is optimal for the transportation problem if $\exists\, \lambda_i, \mu_j$ such that $d_{ij} - \lambda_i + \mu_j \ge 0$ each $(i, j)$ and $(d_{ij} - \lambda_i + \mu_j)\, x_{ij} = 0$.

Proof. The Lagrangian sufficiency theorem applies. □ 
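The conditions of Theorem 11.3 can be checked directly for a proposed flow and set of node numbers. The sketch below uses the costs, supplies and demands of the worked example in Section 12.1, together with the final flow and node numbers reached there; the function names are hypothetical. It also evaluates $\sum_i \lambda_i s_i - \sum_j \mu_j d_j$, the value of the Lagrangian at its minimum, which should equal the cost of an optimal flow.

```python
cost = [[5, 3, 4, 6], [2, 7, 4, 1], [5, 6, 2, 4]]
supply = [8, 10, 9]
demand = [6, 5, 8, 8]


def is_feasible(x):
    """Row sums must equal supplies, column sums demands, and flows be non-negative."""
    rows_ok = all(sum(row) == s for row, s in zip(x, supply))
    cols_ok = all(sum(col) == d for col, d in zip(zip(*x), demand))
    return rows_ok and cols_ok and all(f >= 0 for row in x for f in row)


def satisfies_theorem_11_3(x, lam, mu):
    """d_ij - lam_i + mu_j >= 0 everywhere, with equality wherever x_ij > 0."""
    for i, row in enumerate(cost):
        for j, d in enumerate(row):
            reduced = d - lam[i] + mu[j]
            if reduced < 0 or reduced * x[i][j] != 0:
                return False
    return True


def total_cost(x):
    return sum(d * f for crow, xrow in zip(cost, x) for d, f in zip(crow, xrow))


final_flow = [[3, 5, 0, 0], [3, 0, 0, 7], [0, 0, 8, 1]]
lam, mu = [0, -3, 0], [-5, -3, -2, -4]
bound = sum(l * s for l, s in zip(lam, supply)) - sum(m * d for m, d in zip(mu, demand))
print(is_feasible(final_flow), satisfies_theorem_11_3(final_flow, lam, mu),
      total_cost(final_flow), bound)
```

Both the cost of the flow and the Lagrangian bound come out as 63, consistent with optimality.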
Remark 

It is no surprise that the same optimality conditions appear as in the minimal cost 
circulation problem. If we augment the transportation network by connecting all 
sources and all destinations to a common 'artificial node' by arcs where the flow is 
constrained to be exactly that which is required (and zero cost) we obtain the same 
problem as in minimal cost circulation form.

[Figure: the augmented network. An artificial node 0 is joined to each source by an arc with $s_i \le x_{0i} \le s_i$ and cost $d_{0i} = 0$, and each destination is joined to node 0 by an arc with $d_j \le x_{j0} \le d_j$ and cost $d_{j0} = 0$. The original arcs have $0 \le x_{ij} < \infty$.]

The optimality conditions on the extra arcs are automatically satisfied by a feasible flow since $x_{j0} = d_j$, $x_{0i} = s_i$ regardless of node numbers.

12 Transportation and Transshipment Problems
12.1 The transportation algorithm 



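1. Set out the supplies and costs in a table as below. Each entry is the cost $d_{ij}$; the last column gives the supplies $s_i$ and the last row the demands $d_j$.

              D1    D2    D3    D4    supply
    S1         5     3     4     6       8
    S2         2     7     4     1      10
    S3         5     6     2     4       9
    demand     6     5     8     8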



2. Allocate an initial feasible flow (by the North-West corner rule or any other sensible method). The NW corner rule says start at the top left corner and dispose of supplies and fulfill demands in order of $i$, $j$ increasing. In our case we get the table below, in which each cell shows the flow $x_{ij}$ followed by the cost $d_{ij}$ in parentheses, and a dash marks an arc carrying no flow.

              D1       D2       D3       D4     supply
    S1       6 (5)    2 (3)    - (4)    - (6)      8
    S2       - (2)    3 (7)    7 (4)    - (1)     10
    S3       - (5)    - (6)    1 (2)    8 (4)      9
    demand    6        5        8        8

In the absence of degeneracy (which we assume) there are exactly $m + n - 1$ non-zero entries, which appear in a 'stair-case' arrangement.

Remark. In our network picture we have constructed a feasible flow on a spanning tree of $m + n - 1$ arcs connecting $n$ sources and $m$ destinations. A set of undirected arcs is spanning if it connects all nodes. It is a tree if it contains no circuits. A spanning tree is the equivalent of a basic solution for this problem.

3. For optimality we require $d_{ij} - \lambda_i + \mu_j = 0$ on any arc with non-zero flow. Set $\lambda_1 = 0$ (arbitrarily) and then compute the remaining $\lambda_i$, $\mu_j$ by using $d_{ij} - \lambda_i + \mu_j = 0$ on arcs for which $x_{ij} > 0$. On the table we have (cells show flow and, in parentheses, cost; the left column gives $\lambda_i$ and the top row $\mu_j$)

    lambda_i \ mu_j    -5       -3        0       -2
          0           6 (5)    2 (3)    - (4)    - (6)
          4           - (2)    3 (7)    7 (4)    - (1)
          2           - (5)    - (6)    1 (2)    8 (4)

The node numbers are also shown on the network version above. With non-zero flows forming a spanning tree we will always be able to compute uniquely all node numbers given one of them.

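4. We now compute $\lambda_i - \mu_j$ for all the remaining boxes (arcs) and write these elsewhere in the boxes. E.g., (values of $\lambda_i - \mu_j$ are shown in square brackets, flows without brackets, costs in parentheses)

    lambda_i \ mu_j    -5        -3         0        -2
          0           6 (5)     2 (3)    [0] (4)   [2] (6)
          4          [9] (2)    3 (7)     7 (4)    [6] (1)
          2          [7] (5)   [5] (6)    1 (2)     8 (4)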

5. If $d_{ij} \ge \lambda_i - \mu_j$ for all $(i, j)$, then the flow is optimal. Stop.

6. If not (e.g., $i = 2$, $j = 1$, where $\lambda_2 - \mu_1 = 9 > d_{21} = 2$), we attempt to increase the flow in arc $(i, j)$ for some $(i, j)$ such that $\lambda_i - \mu_j > d_{ij}$. We seek an adjustment of $+\epsilon$ to the flow in arc $(i, j)$ which keeps the solution feasible (and therefore preserves total supplies and demands). In our case we do this by

              D1      D2      D3      D4
    S1       6-ε     2+ε      .       .
    S2       +ε      3-ε      7       .
    S3        .       .       1       8

and pick $\epsilon$ as large as possible (without any flow going negative) to obtain a new flow (for $\epsilon = 3$).

There is only one way to do this in a non-degenerate problem. The operation is perhaps clearer in the network picture.

We attempt to increase flow in the dotted arc. Adding an arc to a spanning tree creates a circuit. Increase flow around the circuit until one arc drops out, leaving a new spanning tree. The new solution is

              D1      D2      D3      D4
    S1        3       5       .       .
    S2        3       .       7       .
    S3        .       .       1       8

7. Now return to step 3 and recompute node numbers, etc. (Flows are shown without brackets, $\lambda_i - \mu_j$ in square brackets, costs in parentheses.)

    lambda_i \ mu_j    -5        -3        -7        -9
          0           3 (5)     5 (3)    [7] (4)   [9] (6)
         -3           3 (2)    [0] (7)    7 (4)    [6] (1)
         -5          [0] (5)  [-2] (6)    1 (2)     8 (4)

In our example we obtain $\lambda_i = 0, -3, -5$ and $\mu_j = -5, -3, -7, -9$ at the next stage. The expression $d_{ij} - \lambda_i + \mu_j < 0$ for $(i, j) = (1,3)$, $(2,4)$ and $(1,4)$. Increase the flow in $(2,4)$ by 7 to obtain the new flow below. This is now optimal, as we can check from the final node numbers:

    lambda_i \ mu_j    -5        -3        -2        -4
          0           3 (5)     5 (3)    [2] (4)   [4] (6)
         -3           3 (2)    [0] (7)  [-1] (4)    7 (1)
          0          [5] (5)   [3] (6)    8 (2)     1 (4)
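Steps 2 to 4 of the worked example can be reproduced with a short script. This is a sketch, assuming a non-degenerate problem so that the non-zero flows form a spanning tree; the function names are hypothetical, and indices inside the code are 0-based while the printed candidate arcs are 1-based to match the text.

```python
def north_west_corner(supply, demand):
    """Initial flow: start top-left and exhaust supplies and demands in order."""
    s, d = list(supply), list(demand)
    x = [[0] * len(d) for _ in s]
    i = j = 0
    while i < len(s) and j < len(d):
        q = min(s[i], d[j])
        x[i][j] = q
        s[i] -= q
        d[j] -= q
        if s[i] == 0:
            i += 1  # supply i used up: move down a row
        else:
            j += 1  # demand j met: move right a column
    return x


def node_numbers(cost, basis):
    """Solve d_ij - lam_i + mu_j = 0 on the basic arcs, starting from lam_1 = 0."""
    lam = [None] * len(cost)
    mu = [None] * len(cost[0])
    lam[0] = 0
    remaining = set(basis)
    while remaining:
        progressed = False
        for (i, j) in list(remaining):
            if lam[i] is not None and mu[j] is None:
                mu[j] = lam[i] - cost[i][j]
            elif mu[j] is not None and lam[i] is None:
                lam[i] = cost[i][j] + mu[j]
            elif lam[i] is None:
                continue  # neither end is known yet; try again on a later pass
            remaining.discard((i, j))
            progressed = True
        if not progressed:
            raise ValueError("basic arcs do not form a spanning tree")
    return lam, mu


cost = [[5, 3, 4, 6], [2, 7, 4, 1], [5, 6, 2, 4]]
x = north_west_corner([8, 10, 9], [6, 5, 8, 8])
basis = [(i, j) for i in range(3) for j in range(4) if x[i][j] > 0]
lam, mu = node_numbers(cost, basis)
candidates = [(i + 1, j + 1) for i in range(3) for j in range(4)
              if lam[i] - mu[j] > cost[i][j]]
print(x, lam, mu, candidates)
```

The candidate list contains the arcs on which increasing the flow would reduce the cost, including the arc $(2, 1)$ chosen in step 6.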



Remark. The route around which you may need to alter flows can be quite complicated though it is always clear how you should do it. For example, had we tried to increase the flow in arc $(3, 1)$ instead of $(2, 1)$ at step 6 we would have obtained

              D1      D2      D3      D4
    S1       6-ε     2+ε      .       .
    S2        .      3-ε     7+ε      .
    S3       +ε       .      1-ε      8
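Finding the route is a matter of following the unique path in the spanning tree that joins the two ends of the new arc. The sketch below does this and then shifts flow around the circuit; the names `tree_path` and `pivot` are hypothetical, indices are 0-based, and the code assumes a non-degenerate problem (so exactly one arc drops to zero).

```python
def tree_path(basis, row, col):
    """Cells on the unique tree path from source `row` to destination `col`."""
    def search(node, seen):
        if node == ("col", col):
            return []
        kind, k = node
        for (i, j) in basis:
            if kind == "row" and i == k:
                nxt = ("col", j)
            elif kind == "col" and j == k:
                nxt = ("row", i)
            else:
                continue
            if nxt not in seen:
                rest = search(nxt, seen | {nxt})
                if rest is not None:
                    return [(i, j)] + rest
        return None
    return search(("row", row), {("row", row)})


def pivot(x, basis, entering):
    """Increase flow on `entering` as far as possible; return the amount shifted."""
    p, q = entering
    path = tree_path(basis, p, q)
    minus, plus = path[0::2], path[1::2]  # signs alternate around the circuit
    eps = min(x[i][j] for i, j in minus)
    leaving = next(c for c in minus if x[c[0]][c[1]] == eps)
    x[p][q] += eps
    for i, j in minus:
        x[i][j] -= eps
    for i, j in plus:
        x[i][j] += eps
    basis.remove(leaving)
    basis.add(entering)
    return eps


# North-West corner flow from the worked example.
x = [[6, 2, 0, 0], [0, 3, 7, 0], [0, 0, 1, 8]]
basis = {(i, j) for i in range(3) for j in range(4) if x[i][j] > 0}
route_3_1 = tree_path(basis, 2, 0)    # circuit for arc (3, 1) in the Remark
first = pivot(x, basis, (1, 0))       # arc (2, 1), as in step 6
second = pivot(x, basis, (1, 3))      # arc (2, 4), as in step 7
print(route_3_1, first, second, x)
```

With these two pivots the flow moves from the North-West corner solution to the final table of the worked example.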



To summarise: 

1. Pick initial feasible solution with $m + n - 1$ non-zero flows (NW corner rule).

2. Set $\lambda_1 = 0$ and compute $\lambda_i$, $\mu_j$ using $d_{ij} - \lambda_i + \mu_j = 0$ on arcs with non-zero flows.

3. If $d_{ij} - \lambda_i + \mu_j \ge 0$ for all $(i, j)$ then flow is optimal.

4. If not, pick $(i, j)$ for which $d_{ij} - \lambda_i + \mu_j < 0$.

5. Increase flow in arc $(i, j)$ by as much as possible without making the flow in any other arc negative. Return to 2.

12.2 *Simplex-on-a-graph* 

The transportation algorithm can easily be generalised to a problem of minimizing costs in a general network in which there is a constraint $0 \le x_{ij} < \infty$ on each directed arc and a flow $b_i$ enters the network at each node $i$ (though it is hard to keep track of all the numbers by hand). Here we don't label sources and destinations separately, but do allow $b_i > 0$ and $b_i < 0$. Clearly, $\sum_i b_i = 0$ for conservation of flow. The simplex-on-a-graph algorithm solves this problem in an identical fashion to the transportation algorithm. Once again a basic solution is a spanning tree of non-zero flow arcs. Suppose there are $n$ nodes.

1. Pick an initial basic feasible solution. Obtain $n - 1$ non-zero flow arcs.

2. Set $\lambda_1 = 0$ and compute $\lambda_i$ on other nodes using $d_{ij} - \lambda_i + \lambda_j = 0$ on arcs of the spanning tree.

3. Compute $d_{ij} - \lambda_i + \lambda_j$ for other arcs. If all these are $\ge 0$ then optimal. If not, pick $(i, j)$ such that $d_{ij} - \lambda_i + \lambda_j < 0$.

4. Add arc $(i, j)$ to the tree. This creates a circuit. Increase flow around the circuit (in direction of arc $(i, j)$) until one non-zero flow drops to zero and a new basic solution is created. Return to 2.



If it is hard to find an initial basic solution then there is a two-phase version of the 
algorithm (just as for the ordinary simplex algorithm). 

Phase I. Add artificial node 0 with arcs from all nodes with $b_i > 0$ and to all nodes with $b_i < 0$. For Phase I objective put costs of 1 on all arcs to and from node 0 and costs 0 elsewhere in the network. The initial basic solution for Phase I is the spanning tree consisting of the node 0 and all arcs joining it to the original nodes.

At the end of Phase I (if the original problem was feasible) you will have reduced the Phase I cost to 0 (no flow in arcs to or from node 0), so have a basic solution (spanning tree solution) for the original problem.

Phase II. Solve the original problem using the initial solution found by Phase I. 
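The Phase I starting point is simple to write down. The sketch below builds the Phase I costs and the initial flow through the artificial node 0; the three-node network, its supplies and the name `phase_one_start` are assumptions for illustration, and every node is given a non-zero $b_i$ so that the arcs through node 0 do form a spanning tree.

```python
def phase_one_start(b, arcs):
    """Return (cost, flow) dictionaries for the Phase I problem.

    b maps node -> net supply b_i (node label 0 is reserved for the
    artificial node); arcs maps (i, j) -> original cost d_ij.
    """
    if sum(b.values()) != 0:
        raise ValueError("supplies must sum to zero")
    cost = {arc: 0 for arc in arcs}  # original arcs cost 0 in Phase I
    flow = {arc: 0 for arc in arcs}
    for i, b_i in b.items():
        if b_i > 0:
            cost[i, 0] = 1   # arc from a supply node to node 0
            flow[i, 0] = b_i
        elif b_i < 0:
            cost[0, i] = 1   # arc from node 0 to a demand node
            flow[0, i] = -b_i
    return cost, flow


# Hypothetical data: node 1 supplies 3 units, nodes 2 and 3 demand 1 and 2.
b = {1: 3, 2: -1, 3: -2}
arcs = {(1, 2): 4, (2, 3): 1, (1, 3): 6}
cost, flow = phase_one_start(b, arcs)
phase_one_cost = sum(cost[arc] * flow[arc] for arc in cost)
print(flow, phase_one_cost)
```

The initial Phase I cost is $\sum_i |b_i|$, and Phase I succeeds when pivoting has driven it to 0.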

There are several applications of the network theory to problems of graph theory 
and operations research on the further examples sheet. 



12.3 Example: optimal power generation and distribution 

The following real-life problem can be solved by the simplex-on-a-graph algorithm.

The demand for electricity at node $i$ is $d_i$. Node $i$ has $k_i$ generators, that can generate electricity at costs of $a_{i1}, \dots, a_{ik_i}$, up to amounts $b_{i1}, \dots, b_{ik_i}$. There are $n = 12$ nodes and 351 generators in all. The capacity for transmission from node $i$ to $j$ is $c_{ij}$ ($= c_{ji}$).

Let $x_{ij}$ = amount of electricity carried $i \to j$ and let $y_{ij}$ = amount of electricity generated by generator $j$ at node $i$. The LP is

minimize $\sum_{ij} a_{ij} y_{ij}$

subject to $\sum_j y_{ij} - \sum_j x_{ij} + \sum_j x_{ji} = d_i$, $i = 1, \dots, 12$,

$0 \le x_{ij} \le c_{ij}$, $0 \le y_{ij} \le b_{ij}$.

In addition, there are constraints on the maximum amount of power that may be shipped across the cuts shown by the dotted lines in the diagram.